Abstract. It is a classical result of Ginibre that the normalized bulk $k$-point
correlation functions of a complex $n \times n$ gaussian matrix with independent
entries of mean zero and unit variance are asymptotically given by the determinantal
point process on $\mathbb{C}$ with kernel
$K_\infty(z,w) := \frac{1}{\pi} e^{-|z|^2/2 - |w|^2/2 + z\bar{w}}$ in
the limit $n \to \infty$. In this paper we show that this asymptotic law is universal
among all random $n \times n$ matrices $M_n$ whose entries are jointly independent,
exponentially decaying, have independent real and imaginary parts, and whose
moments match those of the complex gaussian ensemble to fourth order. Analogous
results at the edge of the spectrum are also obtained. As an application,
we extend a central limit theorem for the number of eigenvalues of complex
gaussian matrices in a small disk to these more general ensembles.

These results are non-Hermitian analogues of some recent universality results
for Hermitian Wigner matrices. However, a key new difficulty arises in
the non-Hermitian case, due to the instability of the spectrum for such matrices.
To resolve this issue, we need to work with the log-determinants
$\log|\det(M_n - z_0)|$ rather than with the Stieltjes transform
$\frac{1}{n}\operatorname{trace}(M_n - z_0)^{-1}$,
in order to exploit Girko's Hermitization method. Our main tools are a four
moment theorem for these log-determinants, together with a strong concentration
result for the log-determinants in the gaussian case. The latter is established
by studying the solutions of a certain nonlinear stochastic difference
equation.

With some extra consideration, we can extend our arguments to the real
case, proving universality for correlation functions of real matrices which match
the real gaussian ensemble to the fourth order. As an application, we show
that a real $n \times n$ matrix whose entries are jointly independent, exponentially
decaying, and whose moments match the real gaussian ensemble to fourth
order has $\sqrt{\frac{2n}{\pi}} + o(\sqrt{n})$ real eigenvalues asymptotically almost surely.

1. Introduction

Let $M_n$ be a random $n \times n$ matrix with complex entries, which is not necessarily
assumed to be Hermitian, and can be either a continuous or discrete ensemble of
matrices. Then, counting multiplicities, there are $n$ complex (algebraic) eigenvalues,
which we enumerate in an arbitrary fashion as

$\lambda_1(M_n),\dots,\lambda_n(M_n) \in \mathbb{C}$.

One can then define, for each $1 \le k \le n$, the $k$-point correlation function

$\rho_n^{(k)} = \rho_n^{(k)}[M_n] : \mathbb{C}^k \to \mathbb{R}^+$

of the random matrix ensemble $M_n$ by requiring that

(1) $\int_{\mathbb{C}^k} F(z_1,\dots,z_k)\,\rho_n^{(k)}(z_1,\dots,z_k)\,dz_1 \dots dz_k = \mathbf{E} \sum_{1 \le i_1,\dots,i_k \le n,\ \text{distinct}} F(\lambda_{i_1}(M_n),\dots,\lambda_{i_k}(M_n))$

for all continuous, compactly supported test functions $F$, where $dz$ denotes Lebesgue
measure on the complex plane $\mathbb{C}$. Note that this definition does not depend on the
exact order in which the eigenvalues of $M_n$ are enumerated.

If $M_n$ is an absolutely continuous matrix ensemble with a continuous density
function, then $\rho_n^{(k)}$ is a continuous function; but if $M_n$ is a discrete ensemble then
$\rho_n^{(k)}$ is merely a non-negative measure$^1$. In the absolutely continuous case with a
continuous density function, one can equivalently define $\rho_n^{(k)}(z_1,\dots,z_k)$ for distinct
$z_1,\dots,z_k$ to be the quantity such that the probability that there is an eigenvalue
of $M_n$ in each of the disks $\{z : |z - z_i| < \varepsilon\}$ for $i = 1,\dots,k$ is asymptotically
$(\rho_n^{(k)}(z_1,\dots,z_k) + o(1))(\pi\varepsilon^2)^k$ in the limit $\varepsilon \to 0^+$.

We note two model cases of continuous matrix ensembles that are of interest. The
first is the real gaussian matrix ensemble$^2$, in which the coefficients $\xi_{ij}$ are independent
and identically distributed (or iid for short) and have the distribution $N(0,1)_{\mathbb{R}}$
of the real gaussian with mean zero and variance one. We will discuss this case
in more detail later, but for now we will focus instead on the simpler and better
understood case of the complex gaussian matrix ensemble, in which the $\xi_{ij}$ are iid
with the distribution of a complex gaussian $N(0,1)_{\mathbb{C}}$ with mean zero and variance
one (or in other words, the probability distribution of each $\xi_{ij}$ is $\frac{1}{\pi} e^{-|z|^2}\,dz$, and
the real and imaginary parts of $\xi_{ij}$ independently have the distribution $N(0,1/2)_{\mathbb{R}}$).
As is well known, the correlation functions of a complex gaussian matrix are given
by the explicit Ginibre formula [26]

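(2) $\rho_n^{(k)}(z_1,\dots,z_k) = \det(K_n(z_i,z_j))_{1 \le i,j \le k}$

where $K_n : \mathbb{C} \times \mathbb{C} \to \mathbb{C}$ is the kernel

(3) $K_n(z,w) := \frac{1}{\pi} e^{-(|z|^2+|w|^2)/2} \sum_{j=0}^{n-1} \frac{(z\bar{w})^j}{j!}$.

In particular, one has

(4) $\rho_n^{(1)}(z) = K_n(z,z) = \frac{1}{\pi} e^{-|z|^2} \sum_{j=0}^{n-1} \frac{|z|^{2j}}{j!}$

and thus (by Taylor expansion of $e^{|z|^2}$) one has the asymptotic

$\rho_n^{(1)}(\sqrt{n}z) \to \frac{1}{\pi} 1_{|z| \le 1}$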

as $n \to \infty$ for almost every $z \in \mathbb{C}$. This gives the well-known circular law for
complex gaussian matrices, namely that the empirical spectral distribution of
$\frac{1}{\sqrt{n}} M_n$ converges (in expectation, at least) to the circular measure
$\frac{1}{\pi} 1_{B(0,1)}\,dz$, where we use $B(z_0,r) := \{z \in \mathbb{C} : |z - z_0| < r\}$ to denote
an open disk in the complex plane. Informally, this means that the eigenvalues of $M_n$
are asymptotically uniformly distributed on the disk $B(0,\sqrt{n})$. The circular law is
also known to hold for many other ensembles of matrices, and for several modes of
convergence. In particular, it holds (both in probability and in the almost sure sense)
for random matrices with iid entries having mean 0 and variance 1; see the surveys
[53, 5] for further discussion of this and related results. Figures 2, 3 later in this
paper illustrate the circular law for two model instances of iid ensembles, namely the
real gaussian and real Bernoulli ensembles.

$^1$ Here, we have abused notation by identifying a measure
$\rho_n^{(k)}(z_1,\dots,z_k)\,dz_1 \dots dz_k$ with its density $\rho_n^{(k)}$.

$^2$ Strictly speaking, the real gaussian matrix ensemble is only absolutely continuous with
respect to Lebesgue measure on the space of real $n \times n$ matrices, rather than on the space
of complex $n \times n$ matrices. However, both ensembles are still continuous in the sense that
any individual matrix occurs in the ensemble with probability zero.
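
The circular law for the complex gaussian ensemble can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `complex_ginibre` and `fraction_in_disk`, the matrix size and the random seed are illustrative choices and do not come from the paper.

```python
import numpy as np


def complex_ginibre(n, rng):
    """n x n matrix with iid N(0,1)_C entries (real and imaginary parts N(0,1/2)_R)."""
    scale = np.sqrt(0.5)
    return scale * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))


def fraction_in_disk(eigs, radius):
    """Fraction of the given points lying in the closed disk of the given radius."""
    return float(np.mean(np.abs(eigs) <= radius))


rng = np.random.default_rng(0)
n = 400
# Eigenvalues of M_n / sqrt(n); the circular law predicts a roughly uniform
# distribution on the unit disk.
eigs = np.linalg.eigvals(complex_ginibre(n, rng)) / np.sqrt(n)

# For 0 <= s <= 1 the fraction of rescaled eigenvalues within radius s
# should be close to s**2 (the normalized area of the disk).
for s in (0.5, 0.8, 1.0):
    print(s, fraction_in_disk(eigs, s), s ** 2)
```

A single sample of moderate size already tracks the predicted fractions closely in the bulk; near radius 1 the agreement is looser because of edge effects.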

We also remark that from the obvious inequality

$\sum_{j=0}^{n-1} \frac{|z|^{2j}}{j!} \le \sum_{j=0}^{\infty} \frac{|z|^{2j}}{j!} = e^{|z|^2}$

and (4) we have the uniform bound

$|K_n(z,z)| \le \frac{1}{\pi}$

for all $z$, and hence by positivity of
$\rho_n^{(2)}(z,w) = K_n(z,z)K_n(w,w) - |K_n(z,w)|^2$ we also have

(5) $|K_n(z,w)| \le \frac{1}{\pi}$

for all $z, w$. In particular, from (2) one has

(6) $0 \le \rho^{(k)}_{n,z_1,\dots,z_k}(w_1,\dots,w_k) \le C_k$

in the case of the complex gaussian ensemble for all $w_1,\dots,w_k \in \mathbb{C}$, all $n$, and
some constant $C_k$ depending only on $k$. (Indeed, from the Hadamard inequality
one can take $C_k = \pi^{-k} k^{k/2}$, for instance.) This uniform bound will be technically
convenient for some of our applications. We will also need an analogous bound for
the real gaussian ensemble; see Lemma 11 below.
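
The kernel (3) and the bound (5) are easy to check numerically. The sketch below uses only the Python standard library; the function name `ginibre_kernel` and the sample points are illustrative assumptions, and the log-space summation is an implementation choice made here to avoid overflow, not something taken from the paper.

```python
import cmath
import math


def ginibre_kernel(z, w, n):
    """K_n(z, w) as in (3), with each term of the sum evaluated in log space."""
    a = z * w.conjugate()
    log_prefactor = -(abs(z) ** 2 + abs(w) ** 2) / 2
    if a == 0:
        # Only the j = 0 term of the sum survives.
        return complex(math.exp(log_prefactor) / math.pi)
    log_a = cmath.log(a)
    total = 0j
    for j in range(n):
        # term = exp(-(|z|^2 + |w|^2)/2) * (z * conj(w))^j / j!
        total += cmath.exp(log_prefactor + j * log_a - math.lgamma(j + 1))
    return total / math.pi


z, w, n = 1 + 2j, -0.5 + 0.3j, 20
k_zw = ginibre_kernel(z, w, n)
k_zz = ginibre_kernel(z, z, n).real
k_ww = ginibre_kernel(w, w, n).real

# Uniform bound (5) and non-negativity of the two-point function.
print(abs(k_zw), 1 / math.pi)
print(k_zz * k_ww - abs(k_zw) ** 2)
```

The second printed quantity is the two-point correlation function given by (2) with $k = 2$; its non-negativity is what yields (5) from the diagonal bound.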

Our first main result is a universality result for the $k$-point correlation
functions $\rho^{(k)}_{n,z_1,\dots,z_k}(w_1,\dots,w_k)$, in the spirit of the "Four Moment Theorems" for
Wigner matrices that first appeared in [56]. Very roughly speaking, the result is
that (when measured in the vague topology) the asymptotic behaviour of these
correlation functions for matrices with independent entries depends only on the first
four moments of the entries, though due to our reliance on the Lindeberg exchange
method, we will also need to require these matrices to match moments with the
complex gaussian ensemble. To make this statement more precise, we will need
some further notation.

Definition 1 (Independent-entry matrices). An independent-entry matrix ensemble
is an ensemble of random $n \times n$ matrices $M_n = (\xi_{ij})_{1 \le i,j \le n}$, where the $\xi_{ij}$
are independent and complex random variables, each with mean zero and variance
one; we call the $\xi_{ij}$ the atom distributions of $M_n$. We say that the independent-entry
matrix has independent real and imaginary parts if for each $1 \le i,j \le n$,
$\mathrm{Re}(\xi_{ij})$, $\mathrm{Im}(\xi_{ij})$ are independent. We say that the matrix obeys Condition C1 if
one has

$\mathbf{P}(|\xi_{ij}| \ge t) \le C\exp(-t^c)$

for some fixed $C, c > 0$ (independent of $n$) and all $i, j$.

If $k \ge 0$, we say that two independent-entry matrix ensembles $M_n = (\xi_{ij})_{1 \le i,j \le n}$
and $M'_n = (\xi'_{ij})_{1 \le i,j \le n}$ have matching moments to order $k$ if one has

(7) $\mathbf{E}\,\mathrm{Re}(\xi_{ij})^a\,\mathrm{Im}(\xi_{ij})^b = \mathbf{E}\,\mathrm{Re}(\xi'_{ij})^a\,\mathrm{Im}(\xi'_{ij})^b$

whenever $1 \le i,j \le n$, $a, b \ge 0$ and $a + b \le k$.
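
The matching condition (7) can be checked mechanically for a given atom distribution. The sketch below is an illustration only: the helper names and the choice of test distribution (real and imaginary parts independently equal to $\pm 1/\sqrt{2}$ with equal probability) are assumptions made here, not objects defined in the paper. It compares the mixed moments of a finitely supported atom with those of $N(0,1)_{\mathbb{C}}$.

```python
import itertools
import math


def gaussian_moment(a, variance):
    """E X^a for a real centered gaussian X with the given variance."""
    if a % 2 == 1:
        return 0.0
    double_factorial = math.prod(range(1, a, 2))  # (a - 1)!!, equal to 1 when a = 0
    return double_factorial * variance ** (a // 2)


def complex_gaussian_moment(a, b):
    """E Re(xi)^a Im(xi)^b for xi ~ N(0,1)_C (independent N(0,1/2)_R parts)."""
    return gaussian_moment(a, 0.5) * gaussian_moment(b, 0.5)


def discrete_moment(atoms, a, b):
    """E Re(xi)^a Im(xi)^b for a finitely supported xi given as (value, probability) pairs."""
    return sum(p * v.real ** a * v.imag ** b for v, p in atoms)


def matching_order(atoms, max_order=6, tol=1e-12):
    """Largest k <= max_order such that (7) holds against N(0,1)_C for all a + b <= k."""
    for k in range(max_order + 1):
        for a in range(k + 1):
            b = k - a
            if abs(discrete_moment(atoms, a, b) - complex_gaussian_moment(a, b)) > tol:
                return k - 1
    return max_order


# Hypothetical test atom: independent signs in the real and imaginary parts.
atoms = [(complex(s, t) / math.sqrt(2), 0.25)
         for s, t in itertools.product((-1, 1), repeat=2)]
print(matching_order(atoms))
```

For this atom the first mismatch occurs at fourth order, since $\mathbf{E}\,\mathrm{Im}(\xi)^4 = 1/4$ while the gaussian value is $3/4$.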
Our first main result is then as follows.

Theorem 2 (Four Moment Theorem for complex matrices). Let $M_n, M'_n$ be independent-entry
matrix ensembles with independent real and imaginary parts, obeying Condition C1,
such that $M_n$ and $M'_n$ both match moments with the complex gaussian
matrix ensemble to third order, and match moments with each other to fourth order.
Let $k \ge 1$ be a fixed integer, let $z_1,\dots,z_k \in \mathbb{C}$ be bounded (thus $|z_i| \le C$ for
all $i = 1,\dots,k$ and some fixed $C > 0$), and let $F : \mathbb{C}^k \to \mathbb{C}$ be a smooth function,
which admits a decomposition of the form

(8) $F(w_1,\dots,w_k) = \sum_{i=1}^{m} F_{i,1}(w_1) \cdots F_{i,k}(w_k)$

for some fixed $m$ and some smooth functions $F_{i,j} : \mathbb{C} \to \mathbb{C}$ for $i = 1,\dots,m$ and
$j = 1,\dots,k$ supported on the disk $\{w : |w| \le C\}$ obeying the derivative bounds$^3$

(9) $|\nabla^a F_{i,j}(w)| \le C$

for all $0 \le a \le 5$, $i = 1,\dots,m$, $j = 1,\dots,k$ and $w \in \mathbb{C}$, and some fixed $C$. Let
$\rho_n^{(k)}, {\rho'}_n^{(k)}$ be the correlation functions for $M_n, M'_n$ respectively. Then

$\int_{\mathbb{C}^k} F(w_1,\dots,w_k)\,\rho_n^{(k)}(\sqrt{n}z_1 + w_1,\dots,\sqrt{n}z_k + w_k)\,dw_1 \dots dw_k$
$= \int_{\mathbb{C}^k} F(w_1,\dots,w_k)\,{\rho'}_n^{(k)}(\sqrt{n}z_1 + w_1,\dots,\sqrt{n}z_k + w_k)\,dw_1 \dots dw_k + O(n^{-c})$

for some absolute constant $c > 0$ (independent of $k$). Furthermore, the implicit
constant in the $O(n^{-c})$ notation is uniform over all $z_1,\dots,z_k$ in the bounded region
$\{z : |z| \le C\}$.

Remark 3. The regularity hypotheses on the test function $F$ here are somewhat
technical, but they are needed to obtain the uniform polynomial decay $O(n^{-c})$ in the
conclusion, which is useful for several applications. Note that by rescaling one could
allow the bound $C$ in (9) to be enlarged somewhat, to $Cn^{c/2k}$, without impacting
the conclusion (other than to degrade the $O(n^{-c})$ error slightly to $O(n^{-c/2})$). If
one is only seeking a qualitative error term of $o(1)$, then by applying the
Stone-Weierstrass theorem, one only needs $F$ to be continuous and compactly supported,
instead of having a smooth factorization of the form (8); see the proof of Corollary
7 below. Also, if $F$ is smooth and compactly supported, then by using a partial
Fourier expansion one can again obtain a polynomial decay rate $O(n^{-c})$ (with the
implied constant depending on the bounds on finitely many derivatives of $F$). It is
possible to improve the value of $c$ somewhat by adding additional matching moment
hypotheses, but then one also requires the derivative bounds (9) for a larger range of
exponents $a$; we will not quantify this variant of Theorem 2 here. The requirement
that $M_n, M'_n$ match the complex gaussian ensemble to third order can be removed
if $z_1,\dots,z_k$ stay a bounded distance away from the origin, using an extremely
recent result of Bourgade, Yau, and Yin [8]; see Remark 22.

Theorem 2 is motivated by the phenomenon, first observed in [56], that the
asymptotic local statistics of the spectrum of a random Hermitian matrix of Wigner
type typically depend only on the first four moments of the entries; formalizations
of this phenomenon are known as four moment theorems. In particular, Corollary
7 is analogous$^4$ to the four moment theorems in [56, Theorems 11, 38].

$^3$ See Section 3 for the definition of the $a$-fold gradient $\nabla^a F_{i,j}$.

Remark 4. The hypothesis of independent real and imaginary parts is primarily
for reasons of notational convenience, and it is likely that this hypothesis could be
dropped from our results. Note that when $M_n$ and $M'_n$ have independent real and
imaginary parts, the moment matching condition (7) simplifies to

$\mathbf{E}\,\mathrm{Re}(\xi_{ij})^a = \mathbf{E}\,\mathrm{Re}(\xi'_{ij})^a$

and

$\mathbf{E}\,\mathrm{Im}(\xi_{ij})^b = \mathbf{E}\,\mathrm{Im}(\xi'_{ij})^b$

for $1 \le i,j \le n$ and $0 \le a, b \le k$.

It is also likely that the exponential decay condition in Condition C1 could be
replaced with a bound on a sufficiently high moment of the entries. We will however
not pursue these refinements here. The vague convergence in the conclusion is
natural given that the ensemble $M_n$ is permitted to be discrete (so that $\rho_n^{(k)}$ could
be a discrete measure, rather than a continuous function). In analogy with the
Hermitian theory (see e.g. [58]), it is reasonable to conjecture that stronger modes
of convergence become available if some additional regularity hypotheses are placed
on the entries, but we will not pursue such matters here.

We now discuss some applications of Theorem 2. The first application concerns the
asymptotic behaviour of the $k$-point correlation functions as $n \to \infty$. In the case
when $M_n$ is drawn from the complex gaussian ensemble, these asymptotics have
been well understood since the work of Ginibre [26]. To recall these asymptotics
we introduce the following functions.

Definition 5 (Asymptotic kernel). For complex numbers $z_1, z_2, w_1, w_2$, define the
kernel $K_{\infty,z_1,z_2}(w_1,w_2)$ by the following rules:

(i) If $z_1 \ne z_2$, then $K_{\infty,z_1,z_2}(w_1,w_2) := 0$.

(ii) If $z_1 = z_2$ and $|z_1| > 1$, then $K_{\infty,z_1,z_2}(w_1,w_2) := 0$.

(iii) If $z_1 = z_2$ and $|z_1| < 1$, then
$K_{\infty,z_1,z_2}(w_1,w_2) := \frac{1}{\pi} e^{-|w_1|^2/2 - |w_2|^2/2 + w_1\overline{w_2}}$.

(iv) If $z_1 = z_2$ and $|z_1| = 1$, then
$K_{\infty,z_1,z_2}(w_1,w_2) := \frac{1}{\pi} e^{-|w_1|^2/2 - |w_2|^2/2 + w_1\overline{w_2}} \left( \frac{1}{2} + \frac{1}{2}\operatorname{erf}\left(-\sqrt{2}(z_1\overline{w_2} + w_1\overline{z_1})\right) \right)$.

Here

$\operatorname{erf}(z) := \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt$

is the usual error function, defined for all complex $z$, where the integral is over an
arbitrary contour from $0$ to $z$. For complex numbers $z_1,\dots,z_k,w_1,\dots,w_k$, define
the correlation function

$\rho^{(k)}_{\infty,z_1,\dots,z_k}(w_1,\dots,w_k) := \det\left(K_{\infty,z_i,z_j}(w_i,w_j)\right)_{1 \le i,j \le k}$.

$^4$ Thanks to more recent results by many authors [16], [20], [54], [21], [22], [58], these results
are no longer the sharpest results available in the Wigner setting, as the moment matching
conditions have now largely been removed, the exponential decay condition relaxed to a finite
moment condition, and the bulk results extended to the edge; see the discussion in [58] or the
surveys [15], [28], [44], [61] for more details. In view of these results, it is reasonable to
conjecture that the moment matching assumptions in Theorem 2 or Corollary 7 may be relaxed; see
Remark 22 for some very recent developments in this direction.

In the model case when $z_1,\dots,z_k$ all avoid the unit circle $\{z \in \mathbb{C} : |z| = 1\}$, the
kernel simplifies to

$K_{\infty,z_i,z_j}(w_i,w_j) = 1_{z_i = z_j} 1_{|z_i| < 1} K_\infty(w_i,w_j)$

where

$K_\infty(z,w) := \frac{1}{\pi} e^{-|z|^2/2 - |w|^2/2 + z\bar{w}}$.

The kernel $K_\infty$ can also be interpreted as the reproducing kernel for the orthogonal
projection in $L^2(\mathbb{C})$ to (the closure of) the space of functions $f(z)$ that become
holomorphic after multiplication by $e^{|z|^2/2}$, or equivalently to the closed span of
$z^k e^{-|z|^2/2}$ for $k = 0, 1, \dots$.
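
One consequence of the projection interpretation is the reproducing identity $\int_{\mathbb{C}} K_\infty(z,w) K_\infty(w,u)\,dw = K_\infty(z,u)$, which can be tested by brute-force quadrature. The sketch below assumes NumPy; the function names, the truncation window and the grid spacing are arbitrary illustrative choices, and the Riemann sum is only a numerical check, not part of the paper's argument.

```python
import numpy as np


def k_inf(z, w):
    """K_infty(z, w) = (1/pi) exp(-|z|^2/2 - |w|^2/2 + z * conj(w))."""
    return np.exp(-np.abs(z) ** 2 / 2 - np.abs(w) ** 2 / 2 + z * np.conj(w)) / np.pi


def reproduce(z, u, half_width=8.0, step=0.05):
    """Riemann-sum approximation of the integral of K(z, w) K(w, u) over w in C."""
    xs = np.arange(-half_width, half_width + step, step)
    x, y = np.meshgrid(xs, xs)
    w = x + 1j * y
    # The integrand decays like a gaussian in w, so truncating to a square is harmless.
    return np.sum(k_inf(z, w) * k_inf(w, u)) * step ** 2


z, u = 0.7 + 0.2j, -0.3 + 0.5j
print(reproduce(z, u), k_inf(z, u))
```

The two printed complex numbers should agree to many digits, reflecting that a projection kernel composed with itself returns the same kernel.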

Lemma 6 (Kernel asymptotics). Let $z_1,\dots,z_k,w_1,\dots,w_k$ be fixed complex numbers
for some fixed $k$, and let $M_n$ be drawn from the complex gaussian ensemble.
Then we have$^5$

(10) $\rho_n^{(k)}(\sqrt{n}z_1 + w_1,\dots,\sqrt{n}z_k + w_k) = \rho^{(k)}_{\infty,z_1,\dots,z_k}(w_1,\dots,w_k) + o(1)$.

If none of the $z_1,\dots,z_k$ lie on the unit circle, then we may improve the error term
$o(1)$ to $O(\exp(-\delta n))$ for some fixed $\delta > 0$.

Now suppose that $z_1,\dots,z_k,w_1,\dots,w_k$ are allowed to vary in $n$, but that the
$z_1,\dots,z_k,w_1,\dots,w_k$ remain bounded (i.e. $|z_i|, |w_i| \le C$ for some fixed $C$ and all
$1 \le i \le k$) and the $z_1,\dots,z_k$ stay bounded away from the unit circle (i.e.
$||z_i| - 1| \ge \varepsilon$ for some fixed $\varepsilon > 0$ and all $1 \le i \le k$). Then one still has the
asymptotic (10). In other words, the decay rate of the error term $o(1)$ in (10) is uniform
across all choices of $z_1,\dots,z_k,w_1,\dots,w_k$ in the ranges specified above.

Proof. This is a well-known asymptotic (see e.g. [35], [37], or [7]). For the sake of
completeness, we have written a proof of these standard facts in Appendix B of the
copy of this paper at arXiv:1206.1893v3. □
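
For $k = 1$ the asymptotic (10) can be observed directly from formula (4). The following sketch uses only the standard library; the function name `one_point_density`, the value of $n$ and the sample points are illustrative assumptions. It evaluates $\rho_n^{(1)}(\sqrt{n}z + w)$ in the bulk, outside the disk, and on the unit circle with $w = 0$, where Definition 5 gives the limits $1/\pi$, $0$ and $1/(2\pi)$ respectively.

```python
import math


def one_point_density(zeta, n):
    """rho_n^{(1)}(zeta) from (4), with each term of the sum evaluated in log space."""
    x = abs(zeta) ** 2
    if x == 0.0:
        return 1.0 / math.pi
    log_x = math.log(x)
    # Each term exp(-x) x^j / j! is at most 1, so no overflow can occur.
    total = sum(math.exp(-x + j * log_x - math.lgamma(j + 1)) for j in range(n))
    return total / math.pi


n = 2000
root_n = math.sqrt(n)
bulk = one_point_density(root_n * 0.5 + (0.3 + 0.4j), n)     # |z| < 1
outside = one_point_density(root_n * 1.5 + (0.3 + 0.4j), n)  # |z| > 1
edge = one_point_density(root_n * 1.0, n)                    # |z| = 1, w = 0

print(bulk * math.pi, outside * math.pi, edge * math.pi)
```

The bulk and exterior values are already indistinguishable from their limits at this size, consistent with the exponential error term away from the unit circle, while the edge value approaches $1/2$ (after multiplying by $\pi$) only at a polynomial rate.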

From this lemma we conclude in particular that $\rho^{(k)}_{\infty,z_1,\dots,z_k}(w_1,\dots,w_k) \ge 0$ for all
$k, z_1,\dots,z_k,w_1,\dots,w_k$, which (when combined with (5)) yields the uniform bound

$|K_{\infty,z_1,z_2}(w_1,w_2)| \le \frac{1}{\pi}$

for all $z_1, z_2, w_1, w_2 \in \mathbb{C}$. In particular, we have

(11) $0 \le \rho^{(k)}_{\infty,z_1,\dots,z_k}(w_1,\dots,w_k) \le C_k$

for all $w_1,\dots,w_k \in \mathbb{C}$ and some constant $C_k$ depending only on $k$.

$^5$ See Section 3 for the asymptotic notational conventions we will use in this paper.

Using Theorem 2, we may extend the above asymptotics for complex gaussian
matrices to more general ensembles (including some discrete ensembles), as follows.

Corollary 7 (Universality for complex matrices). Let $M_n$ be an independent-entry
matrix ensemble with independent real and imaginary parts, obeying Condition C1,
and which matches moments with the complex gaussian matrix ensemble to fourth
order. Then for any fixed (i.e. independent of $n$) $k \ge 1$ and fixed $z_1,\dots,z_k \in \mathbb{C}$,
and any fixed continuous, compactly supported function $F : \mathbb{C}^k \to \mathbb{C}$, one has

$\int_{\mathbb{C}^k} F(w_1,\dots,w_k)\,\rho_n^{(k)}(\sqrt{n}z_1 + w_1,\dots,\sqrt{n}z_k + w_k)\,dw_1 \dots dw_k$
$= \int_{\mathbb{C}^k} F(w_1,\dots,w_k)\,\rho^{(k)}_{\infty,z_1,\dots,z_k}(w_1,\dots,w_k)\,dw_1 \dots dw_k + o(1)$.

In other words, the asymptotic (10) is valid in the vague topology for this ensemble.
If $F$ is furthermore assumed to be smooth, then we may improve the $o(1)$ error term
here to $O(n^{-c})$ for some fixed $c > 0$.

Proof. From Theorem 2 and Lemma 6, we obtain Corollary 7 in the case when $F$
admits a decomposition of the form given in Theorem 2 (and in this case the $o(1)$
error can be improved to $O(n^{-c})$). The more general case of continuous, compactly
supported $F$ can then be deduced by using the Stone-Weierstrass theorem to
approximate a continuous $F$ by an approximant $\tilde{F}$ of the form (8) (and by using a
further function of the form in Theorem 2 and (11) to upper bound the error).
When $F$ is smooth, one can replace the use of the Stone-Weierstrass theorem by a
more quantitative partial Fourier series expansion of $F$ (extended periodically in a
suitable fashion), followed by a multiplication by a smooth cutoff function, taking
advantage of the rapid decrease of the Fourier coefficients in the smooth case; we
omit the standard details. □

Remark 8. Note that in contrast to the situation in Theorem 2, the parameters
$z_1,\dots,z_k$ in Corollary 7 are required to be fixed in $n$, as opposed to being allowed
to vary in $n$. Related to this, the error term $o(1)$ in Corollary 7 is not asserted to
be uniform in the choice of $z_1,\dots,z_k$, in contrast to the uniformity in Theorem 2.
Indeed, given that the limiting correlation function $\rho^{(k)}_{\infty,z_1,\dots,z_k}$ behaves
discontinuously in $z_1,\dots,z_k$ whenever two of the $z_i$ collide, or when one of the $z_i$ crosses
the unit circle, one would not expect such uniformity in Corollary 7. Thus, while
Corollary 7 describes more explicitly the limiting behavior (in certain regimes) of
the correlation functions $\rho_n^{(k)}$, we regard Theorem 2 as the more precise statement
regarding the asymptotics of these functions.

In the Hermitian case, Four Moment Theorems can be used to extend various
facts about the asymptotic spectral distribution of special matrix ensembles (such
as the gaussian unitary ensemble) to other matrix ensembles which obey appropriate
moment matching conditions. Similarly, by using Theorem 2, some
facts about eigenvalues of complex gaussian matrices can now be extended to iid
matrix models that match the complex gaussian ensemble to fourth order, although
in some "global" cases the extension is only partial in nature due to the "local"
nature of the four moment theorem. Rather than provide an exhaustive list of such
applications, we will present just one representative such application, namely that
of (partially) extending the following central limit theorem of Rider [39]:
Theorem 9 (Central limit theorem, gaussian case). Let $M_n$ be drawn from the
complex gaussian ensemble. Let $r > 0$ be a real number (depending on $n$) such that
$1/r, r/n^{1/2} = o(1)$. Let $z_0$ be a complex number (also depending on $n$) such that
$|z_0| \le (1 - \varepsilon)\sqrt{n}$ for some fixed $\varepsilon > 0$. Let $N_{B(z_0,r)}$ be the number of eigenvalues of
$M_n$ in the ball $B(z_0,r) := \{z \in \mathbb{C} : |z - z_0| < r\}$. Then we have

$\frac{N_{B(z_0,r)} - r^2}{\pi^{-1/4} r^{1/2}} \to N(0,1)_{\mathbb{R}}$

in the sense of distributions. In fact, we have the slightly stronger statement that

(12) $\mathbf{E}\left( \frac{N_{B(z_0,r)} - r^2}{\pi^{-1/4} r^{1/2}} \right)^k \to \mathbf{E}\,N(0,1)_{\mathbb{R}}^k$

for all fixed natural numbers $k \ge 0$.



Proof. From the general Costin-Lebowitz central limit theorem for determinantal
point processes [12], [47], [48] we know that

$\frac{N_{B(z_0,r)} - \mathbf{E}N_{B(z_0,r)}}{(\operatorname{Var} N_{B(z_0,r)})^{1/2}} \to N(0,1)_{\mathbb{R}}$

provided that $\operatorname{Var} N_{B(z_0,r)} \to \infty$; indeed, an inspection of the proof in [48] gives
the slightly stronger assertion that

$\mathbf{E}\left( \frac{N_{B(z_0,r)} - \mathbf{E}N_{B(z_0,r)}}{(\operatorname{Var} N_{B(z_0,r)})^{1/2}} \right)^k \to \mathbf{E}\,N(0,1)_{\mathbb{R}}^k$

for any fixed $k \ge 0$. Thus it will suffice to establish the asymptotics

$\mathbf{E}N_{B(z_0,r)} = (1 + o(1))r^2$

and

$\operatorname{Var} N_{B(z_0,r)} = (1 + o(1))\pi^{-1/2} r$.

Using (1), (2), one can write the left-hand sides here as

$\int_{B(z_0,r)} K_n(z,z)\,dz$

and

$\int_{B(z_0,r)} K_n(z,z)\,dz - \int_{B(z_0,r)} \int_{B(z_0,r)} |K_n(z,w)|^2\,dz\,dw$

respectively. By Lemma 6, the former expression is asymptotic to
$\int_{B(z_0,r)} \frac{1}{\pi}\,dz = r^2$.
Lemma 6 also reveals that the second expression is asymptotically independent
of $z_0$, and so one may without loss of generality take $z_0 = 0$. But then the
required asymptotic follows from [39, Theorem 1.6] (after allowing for the different
normalisation for $M_n$ in that paper). □
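
The two asymptotics used in this proof, mean $r^2$ and variance $\pi^{-1/2} r$, can be compared with a small Monte Carlo experiment. This is only a sketch under stated assumptions: NumPy is assumed, the function name `disk_counts` and all parameter values are illustrative, and at such small sizes the variance agrees with its asymptotic value only roughly.

```python
import numpy as np


def disk_counts(n, r, z0, samples, rng):
    """Number of eigenvalues in B(z0, r) for independent complex gaussian matrices."""
    counts = np.empty(samples, dtype=int)
    for s in range(samples):
        m = np.sqrt(0.5) * (rng.standard_normal((n, n))
                            + 1j * rng.standard_normal((n, n)))
        eigs = np.linalg.eigvals(m)  # no 1/sqrt(n) rescaling, as in Theorem 9
        counts[s] = np.count_nonzero(np.abs(eigs - z0) < r)
    return counts


rng = np.random.default_rng(1)
r, z0 = 4.0, 2.0 + 1.0j
counts = disk_counts(n=150, r=r, z0=z0, samples=40, rng=rng)

print(counts.mean(), r ** 2)                # empirical mean vs r^2
print(counts.var(), r / np.sqrt(np.pi))     # empirical variance vs pi^{-1/2} r
```

The variance is much smaller than the mean, which reflects the rigidity of the eigenvalue point process compared with a Poisson process of the same intensity.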



Using Theorem 2, we may extend this result to more general ensembles, at least
in the small radius case:

Corollary 10 (Central limit theorem, general case). Let $M_n$ be an independent-entry
matrix ensemble with independent real and imaginary parts, obeying Condition C1,
such that $M_n$ matches moments with the complex gaussian matrix ensemble
to fourth order. Then the conclusion of Theorem 9 for $M_n$ holds provided that
one has the additional assumption $r \le n^{o(1)}$.

Figure 1. [ECDF plot for the number of eigenvalues (in the circle of radius 1/3).]
The cumulative distribution function for the number
of eigenvalues in the disk $B(0, \sqrt{n}/3)$ of real gaussian and real
Bernoulli matrices of size $10{,}000 \times 10{,}000$, after normalizing the
mean by $n/9$ and variance by $\sqrt{n}$. Thanks to Ke Wang for the
data and figure.

We prove this result in Section 6.3. The restriction to small radii $r \le n^{o(1)}$ appears
to be a largely technical restriction, relating to the need to take arbitrarily high
moments in order to establish a central limit theorem; see for instance Figure 1
for some numerical evidence that the central limit theorem should in fact hold for
larger radii as well (and for real matrices as well as complex ones). It seems likely
that one can also obtain extensions of many of the other results in [39] (or related
papers, such as [32], [38]) on gaussian fluctuations from the circular law from the
complex gaussian ensemble to other ensembles that match the complex gaussian
ensemble to a sufficiently large number of moments, but we will not pursue such
results here. We remark that for macroscopic statistics $\sum_{i=1}^{n} F(\lambda_i/\sqrt{n})$ with $F$
fixed and analytic, such extensions (without the need for matching moments beyond
the second moment) were already established in [40].

1.1. The real case and applications. There is a (more complicated) analogue
of Theorem 2 in which the complex entries are replaced by real ones. This has
the effect of forcing the spectrum $\lambda_1(M_n),\dots,\lambda_n(M_n)$ to split into some number
$\lambda_{1,\mathbb{R}}(M_n),\dots,\lambda_{N_{\mathbb{R}}[M_n],\mathbb{R}}(M_n)$ of real eigenvalues, together with some number
$\lambda_{1,\mathbb{C}_+}(M_n),\dots,\lambda_{N_{\mathbb{C}_+}[M_n],\mathbb{C}_+}(M_n)$ of complex eigenvalues in the upper half-plane
$\mathbb{C}_+ := \{z \in \mathbb{C} : \mathrm{Im}(z) > 0\}$, as well as their complex conjugates
$\overline{\lambda_{1,\mathbb{C}_+}(M_n)},\dots,\overline{\lambda_{N_{\mathbb{C}_+}[M_n],\mathbb{C}_+}(M_n)}$,
where $N_{\mathbb{R}}[M_n], N_{\mathbb{C}_+}[M_n]$ denote the number of real eigenvalues of $M_n$ and the
number of eigenvalues of $M_n$ in the upper half-plane respectively (so in particular,
$N_{\mathbb{R}}[M_n] + 2N_{\mathbb{C}_+}[M_n] = n$ almost surely). Because of this additional structure
of the eigenvalues, it is no longer convenient to consider the correlation functions
$\rho_n^{(k)} : \mathbb{C}^k \to \mathbb{R}^+$ as defined in (1), since they become singular when one or more
of the variables is real. Instead, it is more convenient to work with the correlation
functions $\rho_n^{(k,l)} : \mathbb{R}^k \times \mathbb{C}_+^l \to \mathbb{R}^+$, defined for $k, l \ge 0$ by the formula

(13) $\int_{\mathbb{R}^k} \int_{\mathbb{C}_+^l} F(x_1,\dots,x_k,z_1,\dots,z_l)\,\rho_n^{(k,l)}(x_1,\dots,x_k,z_1,\dots,z_l)\,dx_1 \dots dx_k\,dz_1 \dots dz_l$
$= \mathbf{E} \sum_{1 \le i_1 < \dots < i_k \le N_{\mathbb{R}}[M_n]} \sum_{1 \le j_1 < \dots < j_l \le N_{\mathbb{C}_+}[M_n]} F(\lambda_{i_1,\mathbb{R}}(M_n),\dots,\lambda_{i_k,\mathbb{R}}(M_n),\lambda_{j_1,\mathbb{C}_+}(M_n),\dots,\lambda_{j_l,\mathbb{C}_+}(M_n))$.

Again, the exact ordering of the eigenvalues here is unimportant. When the law
of $M_n$ has a continuous density with respect to Lebesgue measure on real matrices
(which is for instance the case with the real gaussian ensemble), one can interpret
$\rho_n^{(k,l)}(x_1,\dots,x_k,z_1,\dots,z_l)$ for distinct $x_1,\dots,x_k \in \mathbb{R}$ and $z_1,\dots,z_l \in \mathbb{C}_+$ as the
unique real number such that the probability of simultaneously having
an eigenvalue of $M_n$ in each of the intervals $(x_i - \varepsilon, x_i + \varepsilon)$ for $i = 1,\dots,k$ and in
each of the disks $B(z_j,\varepsilon)$ for $j = 1,\dots,l$ is equal to

$(1 + o(1))\,\rho_n^{(k,l)}(x_1,\dots,x_k,z_1,\dots,z_l)\,(2\varepsilon)^k (\pi\varepsilon^2)^l$

in the limit as $\varepsilon \to 0$.
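
The splitting of the spectrum of a real matrix into real eigenvalues and conjugate pairs, and the identity $N_{\mathbb{R}}[M_n] + 2N_{\mathbb{C}_+}[M_n] = n$, can be seen in a short experiment. The sketch below assumes NumPy; the name `split_spectrum`, the matrix size and the sample count are illustrative. It also relies on the assumption that the eigenvalue routine returns real eigenvalues of a real input with imaginary part exactly zero; with other solvers a small tolerance may be needed. The comparison value $\sqrt{2n/\pi}$ is the leading-order count of real eigenvalues quoted in the abstract.

```python
import numpy as np


def split_spectrum(m):
    """Split the spectrum of a real square matrix into real eigenvalues and the
    complex eigenvalues in the upper and lower half-planes."""
    eigs = np.linalg.eigvals(m)
    imag = np.imag(eigs)
    # Assumption: real eigenvalues of a real input are reported with imaginary
    # part exactly zero, so an exact comparison is used instead of a tolerance.
    return np.real(eigs[imag == 0]), eigs[imag > 0], eigs[imag < 0]


rng = np.random.default_rng(2)
n = 100
real_counts = []
for _ in range(20):
    real_eigs, upper, lower = split_spectrum(rng.standard_normal((n, n)))
    # N_R + 2 N_{C+} = n, since non-real eigenvalues come in conjugate pairs.
    assert len(real_eigs) + 2 * len(upper) == n
    real_counts.append(len(real_eigs))

print(np.mean(real_counts), np.sqrt(2 * n / np.pi))
```

Since $n$ is even here, every sampled count of real eigenvalues is even as well, which is one way to see why the odd and even cases are treated separately in the literature on the real gaussian ensemble.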



Define $\mathbb{C}_- := \{z \in \mathbb{C} : \mathrm{Im}(z) < 0\}$ and $\mathbb{C}_* := \mathbb{C}_+ \cup \mathbb{C}_- = \mathbb{C} \setminus \mathbb{R}$. We extend the
correlation functions $\rho_n^{(k,l)}$ from $\mathbb{R}^k \times \mathbb{C}_+^l$ to $\mathbb{R}^k \times \mathbb{C}_*^l$ by requiring that the functions
be invariant with respect to conjugations of any of the $l$ coefficients of $\mathbb{C}_*^l$. We then
extend $\rho_n^{(k,l)}$ by zero from $\mathbb{R}^k \times \mathbb{C}_*^l$ to $\mathbb{R}^k \times \mathbb{C}^l$.

When $M_n$ is given by the real gaussian ensemble, the correlation functions $\rho_n^{(k,l)}$
were computed by a variety of methods, for both odd and even $n$, in [46], [45],
[7], [6], [1], [30], [23] (with the $(k,l) = (1,0), (0,1)$ cases worked out previously in
[34], [13], [14], building in turn on the foundational work of Ginibre [26]). The precise
formulae for these correlation functions are somewhat complicated and involve
Pfaffians of a certain $2 \times 2$ matrix kernel; see Appendix B for the formulae when $n$
is even, and [45], [23] for the case when $n$ is odd. To avoid some technical issues we
shall restrict attention to the case when $n$ is even, although it is virtually certain
that the results here should also extend to the odd $n$ case.



For technical reasons, we will need the following variant of (6):

Lemma 11 (Uniform bound on correlation functions). Let $k, l \ge 0$ be fixed natural
numbers, let $n$ be even, and let $M_n$ be drawn from the real gaussian ensemble. Then
for all $x_1,\dots,x_k \in \mathbb{R}$ and $z_1,\dots,z_l \in \mathbb{C}$ one has

$0 \le \rho_n^{(k,l)}(x_1,\dots,x_k,z_1,\dots,z_l) \le C_{k,l}$

for some fixed $C_{k,l}$ depending only on $k, l$.

This lemma follows fairly easily from the computations in [7]; we give the details
in Appendix B. We will need this lemma in order to control the event of having
two real eigenvalues that are very close to each other, or a complex eigenvalue very
close to the real axis, as in those cases, one is close to a transition in which two
real eigenvalues become complex or vice versa, creating a potential instability in
the correlation functions $\rho_n^{(k,l)}$. One can in fact establish stronger level repulsion
estimates which provide some decay on $\rho_n^{(k,l)}(x_1,\dots,x_k,z_1,\dots,z_l)$ as two of the
$x_1,\dots,x_k,z_1,\dots,z_l$ get close to each other, or as one of the $z_i$ gets close to the real
axis, but we will not need such estimates here.

We then have the following analogue of Theorem 2, which is the second main
result of this paper:

Theorem 12 (Four Moment Theorem for real matrices). Let $M_n, M'_n$ be independent-entry
matrix ensembles with real coefficients, obeying Condition C1, such that $M_n$
and $M'_n$ both match moments with the real gaussian matrix ensemble to fourth order.
Let $k, l \ge 0$ be fixed integers, and let $x_1,\dots,x_k \in \mathbb{R}$ and $z_1,\dots,z_l \in \mathbb{C}$ be
bounded. Assume that $n$ is even. Let $F : \mathbb{R}^k \times \mathbb{C}^l \to \mathbb{C}$ be a smooth function which
admits a decomposition of the form

$F(y_1,\dots,y_k,w_1,\dots,w_l) = \sum_{i=1}^{m} G_{i,1}(y_1) \cdots G_{i,k}(y_k)\,F_{i,1}(w_1) \cdots F_{i,l}(w_l)$

for some fixed $m$ and some smooth functions $G_{i,p} : \mathbb{R} \to \mathbb{C}$ and $F_{i,j} : \mathbb{C} \to \mathbb{C}$ for
$i = 1,\dots,m$, $p = 1,\dots,k$ and $j = 1,\dots,l$ supported on the interval
$\{y \in \mathbb{R} : |y| \le C\}$ and disk $\{w \in \mathbb{C} : |w| \le C\}$ respectively, obeying the derivative bounds

$|\nabla^a G_{i,p}(y)|, |\nabla^a F_{i,j}(w)| \le C$

for all $0 \le a \le 5$, $i = 1,\dots,m$, $p = 1,\dots,k$, $j = 1,\dots,l$, $y \in \mathbb{R}$, and $w \in \mathbb{C}$, and
some fixed $C$. Let $\rho_n^{(k,l)}, {\rho'}_n^{(k,l)}$ be the correlation functions for $M_n, M'_n$ respectively.
Then

$\int_{\mathbb{R}^k} \int_{\mathbb{C}^l} F(y_1,\dots,y_k,w_1,\dots,w_l)\,\rho_n^{(k,l)}(\sqrt{n}x_1 + y_1,\dots,\sqrt{n}x_k + y_k,\sqrt{n}z_1 + w_1,\dots,\sqrt{n}z_l + w_l)\,dw_1 \dots dw_l\,dy_1 \dots dy_k$
$= \int_{\mathbb{R}^k} \int_{\mathbb{C}^l} F(y_1,\dots,y_k,w_1,\dots,w_l)\,{\rho'}_n^{(k,l)}(\sqrt{n}x_1 + y_1,\dots,\sqrt{n}x_k + y_k,\sqrt{n}z_1 + w_1,\dots,\sqrt{n}z_l + w_l)\,dw_1 \dots dw_l\,dy_1 \dots dy_k + O(n^{-c})$

for some absolute constant $c > 0$ (independent of $k, l$). Furthermore, the implicit
constant in the $O(n^{-c})$ notation is uniform over all $x_1,\dots,x_k$ and $z_1,\dots,z_l$ in the
bounded regions $\{x \in \mathbb{R} : |x| \le C\}$ and $\{z \in \mathbb{C} : |z| \le C\}$ respectively.

As will be seen in Section 6.2, the proof of Theorem 12 proceeds along the same
lines as Theorem 2, but with some additional arguments involving Lemma 11
required to prevent pairs of eigenvalues from escaping or entering the real axis due to
collisions. It is because of these additional arguments that matching to fourth order,
rather than third order, is required. It is however expected that the moment
conditions could be relaxed; see for instance Figures 2, 3 for the close resemblance in
spectral statistics between real gaussian and Bernoulli matrices, which only match
to third order rather than to fourth order.

Figure 2. The spectrum of a random real gaussian $10{,}000 \times 10{,}000$ matrix,
with additional detail near the origin to show the
concentration on the real axis. Thanks to Ke Wang for the data
and figure.

Figure 3. The spectrum of a random real Bernoulli $10{,}000 \times 10{,}000$ matrix,
with additional detail near the origin. Thanks to
Ke Wang for the data and figure.

Remark 13. In [45], some explicit formulae for the correlation functions of real
gaussian matrices in the case of odd $n$ were given, while in [23] a relationship
between the correlation functions for odd and even $n$ is established. In principle,
one could use either of these two results to extend Lemma 11 to the odd $n$ case.
Once the odd case of Lemma 11 is obtained, Theorem 12 extends automatically to
this case. Due to space limitations, we do not attempt to execute this calculation
here.

We now turn to applications of Theorem 12. In the complex case, the asymptotics
for complex gaussian matrices given in Lemma 6 could be extended to other
independent-entry matrices using Theorem 2, yielding Corollary 7. We now develop
some analogous results in the real gaussian case. We first recall the following result
of Borodin and Sinclair [7]:

Lemma 14 (Kernel asymptotics, real case). Let $k, l \ge 0$ be fixed natural numbers, 
and let $z$ be a fixed complex number. Assume either that $k = 0$, or that $z$ is real. 
Then there is a function $\rho^{(k,l)}_{\infty,z} : \mathbb{R}^k \times \mathbb{C}^l \to \mathbb{R}^+$ with the property that one has the 
pointwise convergence 

$$\rho^{(k,l)}_n(\sqrt{n}z+y_1, \dots, \sqrt{n}z+y_k, \sqrt{n}z+w_1, \dots, \sqrt{n}z+w_l) \to \rho^{(k,l)}_{\infty,z}(y_1,\dots,y_k,w_1,\dots,w_l)$$

as $n \to \infty$, provided that $M_n$ is drawn from the real gaussian ensemble and $n$ is 
restricted to be even. 



Proof. See [6, Section 7] or [7, Section 8]. The limit $\rho^{(k,l)}_{\infty,z}$ is explicitly computed in 
these references, although when $z$ is real the limit is quite complicated, being given 
in terms of a Pfaffian of a moderately complicated matrix kernel involving the error 
function erf. However, when $z$ is strictly complex the limit is the same as in the 
complex gaussian case, thus $\rho^{(0,l)}_{\infty,z} = \rho^{(l)}_{\infty,z,\dots,z}$; see [7] for further details. It is likely 
that the same asymptotic also holds for odd $n$, by using the explicit formulae in 
[45] or the relation between the odd and even $n$ correlation functions given in [23]; 
if the restriction to even $n$ is similarly dropped from Lemma 11, then Corollary 15 
below can be extended to the odd $n$ case. However, we will not pursue this matter 
here. □ 



We can then obtain the following universality theorem for the correlation functions 
of real matrices: 

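Corollary 15 (Universality for real matrices). Let $M_n$ be an independent-entry 
matrix ensemble with real coefficients obeying Condition C1, and which matches 
moments with the real gaussian matrix ensemble to fourth order. Assume $n$ is 
even. Let $k, l \ge 0$ be fixed natural numbers, and let $z$ be a fixed complex number. 
Assume either that $k = 0$, or that $z$ is real. Let $F : \mathbb{R}^k \times \mathbb{C}^l \to \mathbb{R}^+$ be a fixed 
continuous, compactly supported function. Then 

$$\int_{\mathbb{R}^k}\int_{\mathbb{C}^l} F(y_1,\dots,y_k,w_1,\dots,w_l)\, \rho^{(k,l)}_n(\sqrt{n}z+y_1,\dots,\sqrt{n}z+y_k,\sqrt{n}z+w_1,\dots,\sqrt{n}z+w_l)\, dw_1 \dots dw_l\, dy_1 \dots dy_k$$

$$\to \int_{\mathbb{R}^k}\int_{\mathbb{C}^l} F(y_1,\dots,y_k,w_1,\dots,w_l)\, \rho^{(k,l)}_{\infty,z}(y_1,\dots,y_k,w_1,\dots,w_l)\, dw_1 \dots dw_l\, dy_1 \dots dy_k,$$

where $\rho^{(k,l)}_{\infty,z}$ is as in Lemma 14. 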

Figure 4. The empirical average number of real eigenvalues of 
200 samples of real gaussian and real Bernoulli matrices of various 
sizes, plotted against $\sqrt{2n/\pi}$. Thanks to Ke Wang for the data and 
figure. 



Proof. In the case when M n is drawn from the real gaussian ensemble, this follows 
from Lemma 14, Lemma 11, and the dominated convergence theorem. The exten- 
sion to more general independent-entry matrices then follows from Theorem 12 by 
repeating the arguments used to prove Corollary 7. □ 



As in the complex case, Theorem 12 can be used to (partially) extend various 
known facts about the distribution of the eigenvalues of real gaussian matrices 
to other real independent-entry matrices. Rather than giving an exhaustive list 
of such extensions, we illustrate this with two sample applications. Let $N_{\mathbb{R}}(M_n)$ 
denote the number of real eigenvalues of a random matrix $M_n$. Thanks to earlier results 
[13, 24], we have the following asymptotics: 

Theorem 16 (Real eigenvalues of a real gaussian matrix). Let $M_n$ be drawn from 
the real gaussian ensemble. Then 

$$\mathbf{E} N_{\mathbb{R}}(M_n) = \sqrt{\frac{2n}{\pi}} + O(1)$$

and 

$$\operatorname{Var} N_{\mathbb{R}}(M_n) = (2 - \sqrt{2})\sqrt{\frac{2n}{\pi}} + o(\sqrt{n}).$$



Proof. The expectation bound was established in [13], and the variance bound in 
[24]. In fact, more precise asymptotics are available for both the expectation and 
the variance; we refer the reader to these two papers [13] , [24] for further details. □ 
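
As a numerical illustration of the expectation asymptotic in Theorem 16, the following sketch evaluates a closed-form expression for $\mathbf{E} N_{\mathbb{R}}(M_n)$ with $n$ even and compares it with the leading term $\sqrt{2n/\pi}$. The closed form $\sqrt{2}\sum_{k=0}^{n/2-1} (4k-1)!!/(4k)!!$ is not stated in this paper; it is recalled from the literature on real gaussian matrices and should be read as an assumption, as should the helper names `expected_real_eigenvalues` and `leading_term`.

```python
import math

def expected_real_eigenvalues(n):
    """Assumed closed form for E N_R(M_n), real gaussian ensemble, n even:
    sqrt(2) * sum_{k=0}^{n/2-1} (4k-1)!! / (4k)!!, with (-1)!! = 0!! = 1."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("this sketch only treats positive even n")
    ratio = 1.0   # value of (4k-1)!! / (4k)!! at k = 0
    total = 0.0
    for k in range(n // 2):
        total += ratio
        # Passing from k to k+1 multiplies by (4k+1)(4k+3) / ((4k+2)(4k+4)).
        ratio *= (4 * k + 1) * (4 * k + 3) / ((4 * k + 2) * (4 * k + 4))
    return math.sqrt(2.0) * total

def leading_term(n):
    """The leading term sqrt(2n/pi) appearing in Theorem 16."""
    return math.sqrt(2.0 * n / math.pi)

if __name__ == "__main__":
    for n in (2, 10, 100, 1000, 10000):
        exact = expected_real_eigenvalues(n)
        # Print n, the assumed exact expectation, and its distance to sqrt(2n/pi).
        print(n, round(exact, 4), round(exact - leading_term(n), 4))
```

Under the assumed closed form, the difference in the last column stays bounded as $n$ grows, which is the behaviour that the $O(1)$ error term in Theorem 16 describes.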

By using the above universality results, we may partially extend this result to 
more general ensembles: 

Corollary 17 (Real eigenvalues of a real matrix). Let $M_n$ be an independent-entry 
matrix ensemble with real coefficients obeying Condition C1, and which matches 
moments with the real gaussian matrix ensemble to fourth order. Assume $n$ is 
even. Then 

$$\mathbf{E} N_{\mathbb{R}}(M_n) = \sqrt{\frac{2n}{\pi}} + O(n^{1/2-c})$$

and 

$$\operatorname{Var} N_{\mathbb{R}}(M_n) = O(n^{1-c})$$

for some fixed $c > 0$. In particular, from Chebyshev's inequality, we have 

$$N_{\mathbb{R}}(M_n) = \sqrt{\frac{2n}{\pi}} + O(n^{1/2-c'})$$

with probability $1 - O(n^{-c'})$, for some fixed $c' > 0$. 

We prove this result in Section 6.3. 

As another quick application, we can show for many ensembles that most of the 
eigenvalues are simple: 

Corollary 18 (Most eigenvalues simple). Let $M_n$ be an independent-entry matrix 
ensemble obeying Condition C1, and which matches moments with the real or complex 
gaussian matrix to fourth order. In the real case, assume $n$ is even. Then with 
probability $1 - O(n^{-c})$, at most $O(n^{1-c})$ of the complex eigenvalues, and $O(n^{1/2-c})$ 
of the real eigenvalues, are repeated, for some fixed $c > 0$. 

We establish this result in Section 6.3 also. It should in fact be the case that with 
overwhelming probability, none of the eigenvalues are repeated, but this seems to 
be beyond the reach of our methods. 

We thank Anthony Mays and the anonymous referees for corrections and help 
with the references. 



2. Key ideas and a sketch of the proof 

The proof of the four moment theorem for (Hermitian) Wigner ensembles in [56] is 
based on the Lindeberg exchange strategy, in which one shows that various statistics 
of ensembles are stable with respect to the swapping of one or two of the coefficients 
of that ensemble. The original argument in [56] was based on a swapping analysis 
of individual eigenvalues $\lambda_i(M_n)$, which was somewhat complicated technically; but 
in [21], [31] it was observed that one could work instead with the simpler swapping 
analysis of resolvents$^6$ (or Green's functions) $R(z) := (W_n - z)^{-1}$, particularly if one 
was mainly focused on obtaining a Four Moment Theorem for correlation functions, 
rather than for individual eigenvalues (which in any event are not natural to work 
with in the non-Hermitian case). In all of these arguments for Wigner matrices, 
a key role was played by the local semi-circle law, which could in turn be proven 
by exploiting concentration results for the Stieltjes transform 
$s(z) := \frac{1}{n}\operatorname{trace}(W_n - z)^{-1}$ of a Wigner matrix. Again, we refer the reader to the 
preceding surveys for details. 

$^6$ Here and in the sequel we adopt the abbreviation $z$ for the scalar multiple $zI$ of the identity 
matrix. 

Our strategy of proof of Theorem 2 and Theorem 12 is broadly analogous to that 
in the Hermitian case, in that it relies on a four moment theorem (Theorem 23 
below) and on a local circular law (Theorem 20 below). However, it is highly 
non-trivial to execute this plan. We are going to need a number of new ideas, 
coming from different fields of mathematics, and a fair amount of delicate analysis 
using advanced sharp concentration tools. 

To start, there is an essential difference between handling non-Hermitian and 
Hermitian matrices, namely that the spectrum of a non-Hermitian matrix is highly 
unstable (see [3] for a discussion). Due to this difficulty, even the (global) circular 
law, which is the non-Hermitian analogue of the Wigner semi-circle law, required several 
decades of effort to prove, and was established in full generality only recently (see the surveys 
[53, 5] for further discussion). For this reason, it is no longer practical to make the 
resolvent $(M_n - z)^{-1}$ (and the closely related Stieltjes transform $\frac{1}{n}\operatorname{trace}(M_n - z)^{-1}$) 
the principal object of study. Instead, following the foundational works of Girko 
[27] and Brown [10], we shall focus on the log-determinant 

$$\log|\det(M_n - z)|$$

for a complex number parameter $z$. 

The log-determinant is connected to the eigenvalues of the iid matrix $M_n$ via the 
obvious identity 

(14) $$\log|\det(M_n - z)| = \sum_{i=1}^n \log|\lambda_i(M_n) - z|.$$

In order to restrict to a local region, our idea is to use Jensen's formula. Suppose 
that $f$ is an analytic function in a region in the complex plane which contains the 
closed disk $D$ of radius $r$ about the origin, $a_1, \dots, a_n$ are the zeros of $f$ in the 
interior of $D$ (counting multiplicity), and $f(0) \neq 0$. Then 

$$\log|f(0)| = -\sum_{i=1}^n \log\frac{r}{|a_i|} + \frac{1}{2\pi}\int_0^{2\pi} \log|f(re^{\sqrt{-1}\theta})|\, d\theta.$$

Applying Jensen's formula to (14), we obtain 

(15) $$\log|\det(M_n - z_0)| = -\sum_{1 \le i \le n:\ \lambda_i(M_n) \in B(z_0,r)} \log\frac{r}{|\lambda_i(M_n) - z_0|} + \frac{1}{2\pi}\int_0^{2\pi} \log|\det(M_n - z_0 - re^{\sqrt{-1}\theta})|\, d\theta$$

for any ball $B(z_0,r)$ (with the convention that both sides are equal to $-\infty$ when 
$z_0$ is an eigenvalue of $M_n$). 

From (15), we see (in principle, at least) that information on the (joint) distribution 
of the log-determinants $\log|\det(M_n - z)|$ for various values of $z$ should lead to 
information on the eigenvalues of $M_n$, and in particular on the $k$-point correlation 
functions of $M_n$. As Jensen's formula is a classical tool in complex analysis, this 
step looks quite robust and would potentially find applications in the study of local 
properties of many other random processes. 
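
Jensen's formula can be checked numerically on a polynomial with prescribed zeros. The sketch below is only an illustration under stated assumptions: the helper names (`poly_from_roots`, `circle_average_log_abs`, `jensen_sides`) and the sample zeros are hypothetical, and the circle integral is approximated by an equally spaced quadrature rule rather than computed exactly.

```python
import cmath
import math

def poly_from_roots(roots):
    """Return f(z) = prod_a (z - a) as a Python function."""
    def f(z):
        value = 1.0 + 0.0j
        for a in roots:
            value *= (z - a)
        return value
    return f

def circle_average_log_abs(f, r, num_points=4096):
    """Approximate (1/2pi) * int_0^{2pi} log|f(r e^{i theta})| d theta
    with equally spaced sample points on the circle of radius r."""
    total = 0.0
    for j in range(num_points):
        theta = 2.0 * math.pi * (j + 0.5) / num_points
        total += math.log(abs(f(r * cmath.exp(1j * theta))))
    return total / num_points

def jensen_sides(roots, r):
    """Return (left, right) sides of Jensen's formula on the disk |z| <= r.
    Assumes no root is 0 and no root lies exactly on the circle |z| = r."""
    f = poly_from_roots(roots)
    left = math.log(abs(f(0)))
    inside = [a for a in roots if abs(a) < r]
    right = -sum(math.log(r / abs(a)) for a in inside)
    right += circle_average_log_abs(f, r)
    return left, right

if __name__ == "__main__":
    sample_roots = [0.3 + 0.4j, -0.5j, 1.5, -2 + 1j]   # two zeros inside |z| < 1
    print(jensen_sides(sample_roots, 1.0))
```

Only the zeros inside the disk enter the sum, which is what allows (15) to isolate the eigenvalues in a ball $B(z_0,r)$.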

On the other hand, we can also write the log-determinant in terms of the Hermitian 
$2n \times 2n$ random matrix 

(16) $$W_{n,z} := \frac{1}{\sqrt{n}} \begin{pmatrix} 0 & M_n - z \\ (M_n - z)^* & 0 \end{pmatrix}$$

via the easily verified identity 

(17) $$\log|\det(M_n - z)| = \frac{1}{2}\log|\det W_{n,z}| + \frac{1}{2} n \log n.$$

This observation is known as the Girko Hermitization trick, and in principle reduces 
the spectral theory of non-Hermitian matrices to the spectral theory of Hermitian 
matrices. 
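
The Hermitization identity (17) can be confirmed on a small random matrix. The following is a minimal sketch in pure Python, assuming the block form of $W_{n,z}$ given in (16); all function names are hypothetical, and the log-determinant is computed by plain Gaussian elimination with partial pivoting rather than by any method used in the paper.

```python
import math
import random

def log_abs_det(matrix):
    """log|det| of a square complex matrix via Gaussian elimination."""
    a = [list(row) for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if abs(a[pivot][col]) == 0.0:
            return float("-inf")          # singular matrix
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return total

def shifted(m, z):
    """Return M - z, with z read as z times the identity."""
    n = len(m)
    return [[m[i][j] - (z if i == j else 0) for j in range(n)] for i in range(n)]

def hermitization(m, z):
    """Return the 2n x 2n block matrix n^{-1/2} [[0, M - z], [(M - z)^*, 0]]."""
    n = len(m)
    scale = 1.0 / math.sqrt(n)
    b = shifted(m, z)
    w = [[0j] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            w[i][n + j] = scale * b[i][j]
            w[n + i][j] = scale * b[j][i].conjugate()
    return w

def identity_17_sides(m, z):
    """Return both sides of log|det(M - z)| = (1/2) log|det W| + (1/2) n log n."""
    n = len(m)
    left = log_abs_det(shifted(m, z))
    right = 0.5 * log_abs_det(hermitization(m, z)) + 0.5 * n * math.log(n)
    return left, right

def random_complex_matrix(n, seed=0):
    """An n x n matrix with independent complex gaussian-type entries."""
    rng = random.Random(seed)
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
            for _ in range(n)]

if __name__ == "__main__":
    print(identity_17_sides(random_complex_matrix(4), 0.3 + 0.2j))
```

The factor $\frac{1}{2} n \log n$ comes entirely from the $n^{-1/2}$ normalisation of the $2n \times 2n$ block matrix.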



The log-determinant of $W_{n,z}$ is in turn related to other spectral information of 
$W_{n,z}$, such as the Stieltjes transform$^7$ 

$$s_{W_{n,z}}(E + \sqrt{-1}\eta) := \frac{1}{2n}\operatorname{trace}\left((W_{n,z} - E - \sqrt{-1}\eta)^{-1}\right)$$

of $W_{n,z}$, for instance via the identity 

(18) $$\log|\det W_{n,z}| = \log|\det(W_{n,z} - \sqrt{-1}T)| - 2n \operatorname{Im} \int_0^T s_{W_{n,z}}(\sqrt{-1}\eta)\, d\eta,$$

valid for arbitrary $T > 0$. Thus, in principle at least, information on the distribution 
of the Stieltjes transform $s_{W_{n,z}}$ will imply information on the log-determinant of 
$W_{n,z}$, and hence on $M_n - z$, which in turn gives information on the eigenvalue 
distribution of $M_n$. This is the route taken, for instance, to establish the circular 
law for iid matrices; see [53, 5] for further discussion. There is a non-trivial issue 
with the possible divergence or instability of the integral in (18) near $\eta = 0$, but it 
is now well understood how to control this issue via a regularisation or truncation 
of this integral, provided that one has adequate bounds on the least singular value 
of $W_{n,z}$; again, see [53, 5] for further discussion. Fortunately, we and many other 
researchers have proved such bounds in previous papers, using methods from a 
seemingly unrelated area of Additive Combinatorics (see Proposition 27 below). 
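
Identity (18) depends only on the eigenvalues of the Hermitian matrix, so it can be illustrated at the level of a prescribed spectrum. The sketch below is a hedged illustration, not part of the argument: the function names and the sample eigenvalues are hypothetical, the spectrum is assumed to avoid 0, and the integral over $[0, T]$ is approximated by a midpoint rule.

```python
import math

def log_abs_det_from_spectrum(eigs, eta=0.0):
    """log|det(W - sqrt(-1)*eta)| for a Hermitian W with real eigenvalues eigs."""
    return sum(0.5 * math.log(lam * lam + eta * eta) for lam in eigs)

def im_stieltjes(eigs, eta):
    """Im s_W(sqrt(-1)*eta), with s_W normalised by the matrix size len(eigs)."""
    return sum(eta / (lam * lam + eta * eta) for lam in eigs) / len(eigs)

def identity_18_sides(eigs, T, steps=20000):
    """Return (left, right) sides of identity (18); len(eigs) plays the role of 2n."""
    left = log_abs_det_from_spectrum(eigs)
    h = T / steps
    # Midpoint rule for the integral of Im s_W(sqrt(-1)*eta) over [0, T].
    integral = sum(im_stieltjes(eigs, (j + 0.5) * h) for j in range(steps)) * h
    right = log_abs_det_from_spectrum(eigs, T) - len(eigs) * integral
    return left, right

if __name__ == "__main__":
    # A symmetric sample spectrum, as for the block matrix in (16).
    print(identity_18_sides([-2.0, -1.2, -0.5, 0.5, 1.2, 2.0], 3.0))
```

Moving one eigenvalue close to 0 makes the integrand $\eta/(\lambda^2+\eta^2)$ sharply peaked near $\eta = 0$, which is the instability that the least singular value bounds are needed to control.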

There is a significant technical issue arising from the fact that formulae such as 
(18) or (15) require one to control the value of various random functions, such as 
log-determinants or Stieltjes transforms, for an uncountable number of choices of 
parameters such as $z$ and $\eta$, so that one can no longer directly use the union bound to 
control exceptional events when the expected control on these quantities fails. To 
overcome this, we appeal to the Monte Carlo method, frequently used in 
combinatorics and theoretical computer science. This method enables us to use random 
sampling arguments to replace many of these integral expressions by discrete, 
random approximations, to which the union bound can be safely applied (see Section 
5). 

$^7$ We use $\sqrt{-1}$ to denote the standard imaginary unit, in order to free up the symbol $i$ to be 
an index of summation. 

The application of the Monte Carlo method (Lemma 36), on the other hand, is far 
from straightforward, since in certain situations (see Section 6), the variance is too 
high and so the bound implied by Lemma 36 becomes rather weak. We handle this 
situation by a variance reduction argument, exploiting analytical properties of the 
relevant functions. This step also looks robust and may be useful for practitioners 
of the Monte Carlo method in other fields. 

After these steps, the rest of the proof essentially boils down to error control, 
in the form of a sharp concentration inequality (Theorem 33), which will be established by 
analyzing a delicate (and rather unstable) random process, using recent martingale 
inequalities and various ad hoc ideas. 

Remark 19. For Hermitian ensembles, swapping methods (such as the Four Moment 
Theorem) are not the only way to obtain universality results; there is also 
an important class of methods (such as the local relaxation flow method) that are 
based on analysing the effect of a Dyson-type Brownian motion on the spectrum of 
a random matrix ensemble; see e.g. [15]. However, there is a significant obstruction 
to adapting such methods to the non-Hermitian setting, because the equations of 
the analogue to Dyson Brownian motion either$^8$ couple together the eigenvectors 
and the eigenvalues in a complicated fashion, or need to be phrased in terms of a 
triangular form of the matrix, rather than a diagonal one (cf. [35]). We were unable 
to resolve these difficulties in the non-Hermitian case, and rely solely on swapping 
methods instead; unfortunately, this then requires us to place moment matching 
hypotheses on our matrix ensembles. It seems of interest to develop further tools that 
are able to remove these moment matching hypotheses in non-Hermitian settings. 

$^8$ One can explain this by observing that in the Hermitian case, the eigenvalues determine the 
matrix up to a $U_n(\mathbb{C})$ symmetry, but in the non-Hermitian case the symmetry group is now the 
much larger group $GL_n(\mathbb{C})$. Dyson Brownian motion is $U_n(\mathbb{C})$-invariant, but is not 
$GL_n(\mathbb{C})$-invariant, which is why this motion can be reduced to dynamics purely on eigenvalues in the 
Hermitian case but not in the non-Hermitian one. 

2.1. Key propositions. The proof of Theorem 2 relies on two key facts, both of 
which may be of independent interest. The first is a "local circular law". Given a 
subset $\Omega$ of the complex plane, let 

$$N_\Omega = N_\Omega[M_n] := |\{1 \le i \le n : \lambda_i(M_n) \in \Omega\}|$$

denote the number of eigenvalues of $M_n$ in $\Omega$. 

Theorem 20 (Local circular law). Let $M_n = (\xi_{ij})_{1 \le i,j \le n}$ be an independent-entry 
matrix with independent real and imaginary parts obeying Condition C1, and which 
matches either the real or complex gaussian matrix to third order. Then for any 
fixed $C > 0$, one has with overwhelming probability$^9$ that 

(19) $$N_{B(z_0,r)} = \int_{B(z_0,r)} \frac{1}{\pi} 1_{|z| \le \sqrt{n}}\, dz + O(n^{o(1)} r)$$

uniformly for all $z_0 \in B(0, C\sqrt{n})$ and all $r \ge 1$. In particular, we have 

(20) $$N_{B(z_0,r)} \le n^{o(1)} r^2$$

with overwhelming probability, uniformly for all $z_0 \in B(0, C\sqrt{n})$ and all $r \ge 1$. 

Remark 21. The bound (19) is probably not best possible, even if one ignores 
the $n^{o(1)}$ term. In the complex gaussian case, it has been shown [39] that the 
variance of $N_{B(z_0,r)}$ is actually of order $r$, suggesting a fluctuation of $O(n^{o(1)} r^{1/2})$ 
rather than $O(n^{o(1)} r)$; the closely related results in Theorem 9 and Corollary 10 also 
support this prediction. Also notice that we assume only three matching moments 
in this theorem, so the statement applies for instance to random sign matrices 
(which match the real gaussian ensemble to third order). For our applications to 
Theorems 2, 12, we do not need the full strength (19) of the above theorem; the 
weaker bound (20) will suffice. 

Remark 22. Very recently, Bourgade, Yau, and Yin [8] have established a variant 
of Theorem 20 (and also Theorem 25) which does not require matching to third 
order, but with the disk $B(z_0,r)$ assumed to lie a distance at least $\varepsilon\sqrt{n}$ from the 
circle $\{|z| = \sqrt{n}\}$ for some fixed $\varepsilon > 0$. By using the main result of [8] as a substitute 
for Theorem 20 (and also Theorem 25), we may similarly remove the third order 
matching hypotheses from Theorem 2, at least in the case when $z_1, \dots, z_k$ stay a 
distance $\varepsilon\sqrt{n}$ from the circle $\{|z| = \sqrt{n}\}$. Since the initial release of this paper, an 
alternate proof of Theorem 20 (in the case when one matches the complex gaussian 
ensemble to third order, as opposed to the real gaussian ensemble) which works 
both in the bulk and at the edge was given in [9]. 

The second key fact is a "Four Moment Theorem" for the log-determinants 
$\log|\det(M_n - z)|$: 

Theorem 23 (Four Moment Theorem for determinants). Let $c_0 > 0$ be a 
sufficiently small absolute constant. Let $M_n, M'_n$ be two independent random matrices 
with independent real and imaginary parts obeying Condition C1, which match each 
other to fourth order, and which both match the real gaussian matrix (or both match 
the complex gaussian matrix) to third order. Let $1 \le k \le n^{c_0}$, let $C > 0$ be fixed, 
and let $z_1, \dots, z_k \in B(0, C\sqrt{n})$. Let $G : \mathbb{R}^k \to \mathbb{C}$ be a smooth function obeying the 
derivative bounds 

$$|\nabla^j G(x_1,\dots,x_k)| \ll n^{c_0}$$

for all $j = 0, \dots, 5$ and $x_1,\dots,x_k \in \mathbb{R}$, where $\nabla$ denotes the gradient in $\mathbb{R}^k$. Then 
we have 

$$\mathbf{E} G(\log|\det(M_n - z_1)|, \dots, \log|\det(M_n - z_k)|) = \mathbf{E} G(\log|\det(M'_n - z_1)|, \dots, \log|\det(M'_n - z_k)|) + O(n^{-c_0}),$$

with the convention that the expression $G(\log|\det(M_n - z_1)|, \dots, \log|\det(M_n - z_k)|)$ 
vanishes if one of the $z_1, \dots, z_k$ is an eigenvalue of $M_n$, and similarly for the 
expression $G(\log|\det(M'_n - z_1)|, \dots, \log|\det(M'_n - z_k)|)$. 

$^9$ See Section 3 for a definition of the term "overwhelming probability", and for the 
definition of asymptotic notation such as $o(1)$ and $\ll$. 

The proof of Theorem 2 follows fairly easily from Theorem 20 (in fact, we will only 
need the weaker conclusion (20)) and Theorem 23 (and (10)), using the well-known 
connection between spectral statistics and the log-determinant which goes back to 
the work of Girko [27] and Brown [10], and which was mentioned previously in 
this introduction; we give this implication in Section 6. A slightly more sophisti- 
cated version of the same argument also works to give Theorem 12; we give this 
implication in Section 6.2. 

It remains to establish the local circular law (Theorem 20) and the four moment 
theorem for log-determinants (Theorem 23) . The key lemma in the establishment of 
the local circular law is the following concentration result for the log-determinant. 

Definition 24 (Concentration). Let $n \ge 1$ be a large parameter, and let $X$ be 
a real or complex random variable depending on $n$. We say that $X$ concentrates 
around $M$ for some deterministic scalar $M$ (depending on $n$) if one has 

$$X = M + O(n^{o(1)})$$

with overwhelming probability. Equivalently, for every $\varepsilon, A > 0$ independent of $n$, 
one has $X = M + O(n^{\varepsilon})$ outside of an event of probability $O(n^{-A})$. We say that 
$X$ concentrates if it concentrates around some $M$. 

Theorem 25 (Concentration bound on log-determinant). Let $M_n = (\xi_{ij})_{1 \le i,j \le n}$ be 
an independent-entry matrix obeying Condition C1 and matching the real or complex 
gaussian ensemble to third order. Then for any fixed $C > 0$, and any $z_0 \in B(0, C)$, 
$\log|\det(M_n - z_0\sqrt{n})|$ concentrates around $\frac{1}{2} n \log n + \frac{1}{2} n(|z_0|^2 - 1)$ for $|z_0| \le 1$ and 
around $\frac{1}{2} n \log n + n \log|z_0|$ for $|z_0| \ge 1$, uniformly in $z_0$. 

Remark 26. The reason we require only three moments in this theorem instead 
of four (as in the previous theorem) is that in this theorem the error in Definition 
24 is allowed to be a positive power of $n$ while in the previous one it needs to be 
a negative power. We remark that this theorem is consistent with (14) and the 
circular law; indeed, the quantity $\int_{B(0,1)} \frac{1}{\pi} \log|z - z_0|\, dz$ can be computed to be 
equal to $\frac{1}{2}(|z_0|^2 - 1)$ when $|z_0| \le 1$ and $\log|z_0|$ when $|z_0| \ge 1$. As in Remark 22, a 
variant of Theorem 25 without the third order hypothesis, but requiring $z_0$ bounded 
away from the circle $\{|z| = 1\}$, was recently given in [8]. 
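
The integral quoted in Remark 26 can be checked by direct quadrature. The sketch below is a rough numerical illustration only: the function names and grid sizes are hypothetical choices, and the midpoint rule in polar coordinates is accurate to a few decimal places rather than exact.

```python
import cmath
import math

def disk_log_potential(z0, n_radial=400, n_angular=400):
    """Midpoint-rule approximation of (1/pi) * int_{B(0,1)} log|z - z0| dz,
    where dz is two-dimensional Lebesgue measure, using polar coordinates."""
    total = 0.0
    d_rho = 1.0 / n_radial
    d_theta = 2.0 * math.pi / n_angular
    for i in range(n_radial):
        rho = (i + 0.5) * d_rho
        for j in range(n_angular):
            theta = (j + 0.5) * d_theta
            z = rho * cmath.exp(1j * theta)
            total += math.log(abs(z - z0)) * rho   # rho is the polar Jacobian
    return total * d_rho * d_theta / math.pi

def predicted_value(z0):
    """The closed form stated in Remark 26."""
    r = abs(z0)
    return 0.5 * (r * r - 1.0) if r <= 1.0 else math.log(r)

if __name__ == "__main__":
    for z0 in (0.5, 0.3 + 0.6j, 2.0, -1.5j):
        print(z0, disk_log_potential(z0), predicted_value(z0))
```

Multiplying the closed form by $n$ and adding $\frac{1}{2} n \log n$ for the rescaling by $\sqrt{n}$ recovers the two centering values in Theorem 25.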

We give the derivation of Theorem 20 from Theorem 25 in Section 5. The main 
tools are Jensen's formula (15) and a random sampling argument to approximate 
the integral in (15) by a Monte Carlo type sum, which can then be estimated by 
Theorem 25. 

It remains to establish Theorem 23 and Theorem 25. For both of these theorems, 
we will work with the Hermitian matrix $W_{n,z_0}$ defined in (16), taking advantage of 
the identity (17). In order to manipulate quantities such as the log-determinant of 
$W_{n,z_0}$ efficiently, we will need some basic estimates on the spectrum of this operator 
(as well as on related objects, such as resolvent coefficients). We first need a lower 
bound on the least singular value that is already in the literature: 

Proposition 27 (Least singular value). Let $M_n$ be an independent-entry matrix 
ensemble with independent real and imaginary parts, obeying Condition C1, and 
let $z_0 \in B(0, C\sqrt{n})$ for some fixed $C > 0$. Then with overwhelming probability, one 
has 

$$\inf_{1 \le i \le n} |\lambda_i(W_{n,z_0})| \ge n^{-\log n}.$$

Furthermore, for any fixed $c_0 > 0$ one has 

$$\mathbf{P}\left(\inf_{1 \le i \le n} |\lambda_i(W_{n,z_0})| \le n^{-1/2-c_0}\right) \ll n^{-c_0/2}.$$

The bounds in the tail probability are uniform in $z_0$. 

Proof. Note from (16) that $\inf_{1 \le i \le n} |\lambda_i(W_{n,z_0})|$ is the least singular value of 
$\frac{1}{\sqrt{n}}(M_n - z_0)$. The first bound then follows from [52, Theorem 2.5] (and can also be deduced 
from the second bound). The lower bound $n^{-\log n}$ can be improved to any bound 
decaying faster than a polynomial, but for our applications any lower bound of the 
form $\exp(-n^{o(1)})$ will suffice. The second bound follows from [55, Theorem 3.2] 
(and can also be essentially derived from the results in [42], after adapting those 
results to the case of random matrices whose entries are uncentered (i.e. can have 
non-zero mean)). We remark that in the $z_0 = 0$ case, significantly sharper bounds can 
be obtained; see [42] for details. □ 

Remark 28. The proof of this bound relies heavily on the so-called inverse 
Littlewood-Offord theory introduced by the authors in [51], which was motivated by Additive 
Combinatorics (see [50, Chapter 7]), a seemingly unrelated area. Interestingly, this 
is, at this point, the only way to obtain good lower bounds on the least singular 
values of random matrices when the ensemble is discrete (see also [42, 43, 53] for 
more results and discussion). 

Next, we establish some bounds on the counting function 

$$N_I := |\{1 \le i \le n : \lambda_i(W_{n,z_0}) \in I\}|,$$

and on coefficients $R(\sqrt{-1}\eta)_{ij}$ of the resolvents $R(\sqrt{-1}\eta) := (W_{n,z_0} - \sqrt{-1}\eta)^{-1}$ on 
the imaginary axis. 

Proposition 29 (Crude upper bound on $N_I$). Let $M_n$ be an independent-entry 
matrix ensemble with independent real and imaginary parts, obeying Condition C1. 
Let $C > 0$ be fixed, and let $z_0 \in B(0, C\sqrt{n})$. Then with overwhelming probability, 
one has 

$$N_I \ll n^{o(1)}(1 + n|I|)$$

for all intervals $I$. The bounds in the tail probability (and in the $o(1)$ exponent) are 
uniform in $z_0$. 

Remark 30. It is likely that one can strengthen Proposition 29 to a "local distorted 
quarter-circular law" that gives more accurate upper and lower bounds on Nj, 
analogous to the local semi-circular law from [17], [18], [19] (or, for that matter, 
the local circular law given by Theorem 20). However, we will not need such 
improvements here. 

Proposition 31 (Resolvent bounds). Let $M_n$ be an independent-entry matrix 
ensemble with independent real and imaginary parts, obeying Condition C1. Let 
$C > 0$ be fixed, and let $z_0 \in B(0, C\sqrt{n})$. Then with overwhelming probability, 
one has 



for all $\eta > 0$ and all $1 \le i, j \le n$. 

Remark 32. One can also establish similar bounds on the resolvent (as well as 
closely related delocalization bounds on eigenvectors) for more general spectral 
parameters $E + \sqrt{-1}\eta$. However, in our application we will only need the resolvent 
bounds in the $E = 0$ case. 

Propositions 29 and 31 are proven by standard Stieltjes transform techniques, 
based on analysis of the self-consistent equation of $W_{n,z_0}$ as studied for instance 
by Bai [3], combined with concentration of measure results on quadratic forms. 
The arguments are well established in the literature; indeed, the $z_0 = 0$ case of 
these theorems essentially appeared in [57], [21], while the analogous estimates for 
Wigner matrices appeared in [17], [18], [19], [56]. As the proofs of these results are 
fairly routine modifications of existing arguments in the literature, we will place 
the proof of these propositions in Appendix A. We remark that in the very recent 
paper [8], some stronger eigenvalue rigidity estimates for $W_{n,z_0}$ are obtained (at 
least for $z_0$ staying away from the unit circle $\{|z| = 1\}$), which among other things 
allows one to prove variants of Theorem 25 and Theorem 20 without the moment 
matching hypothesis, and without the need to study the gaussian case separately 
(see Theorem 33 below). 

One can use Propositions 27, 29, 31 to regularise the log-determinant of $W_{n,z_0}$, 
and then show that this log-determinant is quite stable with respect to swapping 
(real and imaginary parts of) individual entries of the matrix $M_n$, so long as one keeps 
the matching moments assumption. In particular, one can now establish Theorem 
23 without much difficulty, using standard resolvent perturbation arguments; see 
Section 8. A similar argument, which we give in Section 10, reduces Theorem 25 
to the gaussian case. Thus, after all this work, the remaining task is to prove 

Theorem 33. Theorem 25 holds when $M_n$ is drawn from the real or complex 
gaussian ensemble. 

We prove this theorem in Section 9. This section is the most technically involved 
part of the paper. The starting point is to use an idea from our previous paper [60], 
which studied the limiting distribution of the log-determinant of a shifted GUE 
matrix. In that paper, the first step was to conjugate the GUE matrix into the 
Trotter tridiagonal form [62], so that the log-determinant could be computed in 
terms of the solution to a certain linear stochastic difference equation. In the case 
in this paper, the analogue of the Trotter tridiagonal form is a Hessenberg matrix 
form (that is, a matrix form which vanishes above the upper diagonal), which (after 
some linear algebraic transformations) can be used to express the log-determinant 
$\log|\det(M_n - z_0\sqrt{n})|$ in terms of the solution to a certain nonlinear stochastic 
difference equation. This Hessenberg form of the complex gaussian ensemble was 
introduced in [33], although the difference equation we derive is different from 
the one used in that paper. To obtain the desired level of concentration in the 
log-determinant, the main difficulty is then to satisfactorily control the interplay 
between the diffusive components of this stochastic difference equation, and the 
stable and unstable equilibria of the nonlinearity, and in particular to show that the 
deviation of the solution from the stable equilibrium behaves like a martingale. This 
then allows us to deduce the desired concentration from a martingale concentration 
result (see Proposition 35 below). 



3. Notation 

Throughout this paper, $n$ is a natural number parameter going off to infinity. A 
quantity is said to be fixed if it does not depend on $n$. We write $X = O(Y)$, 
$Y = \Omega(X)$, or $Y \gg X$ if one has $|X| \le CY$ for some fixed $C$, and $X = o(Y)$ if 
one has $X/Y \to 0$ as $n \to \infty$. Absolute constants such as $C_0$ or $c_0$ are always 
understood to be fixed. 

We say that an event $E$ occurs with overwhelming probability if it occurs with 
probability $1 - O(n^{-A})$ for all fixed $A > 0$. We use $1_E$ to denote the indicator of 
$E$, thus $1_E$ equals 1 when $E$ is true and 0 when $E$ is false. We also write $1_\Omega(x)$ for 
$1_{x \in \Omega}$. 

As we will be using two-dimensional integration on the complex plane 
$\mathbb{C} := \{z = x + \sqrt{-1}y : x, y \in \mathbb{R}\}$ far more often than we will be using contour integration, we 
use $dz = dx\,dy$ to denote Lebesgue measure on the complex numbers, rather than 
the complex line element $dx + \sqrt{-1}\,dy$. 

We use $N(\mu, \sigma^2)_{\mathbb{R}}$ to denote a real gaussian distribution of mean $\mu$ and variance 
$\sigma^2$, so that the probability distribution is given by $\frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2}\, dx$. Similarly, 
we let $N(\mu, \sigma^2)_{\mathbb{C}}$ denote the complex gaussian distribution of mean $\mu$ and variance $\sigma^2$, 
so that the probability distribution is given by $\frac{1}{\pi\sigma^2} e^{-|z-\mu|^2/\sigma^2}\, dz$. Of course, the 
two distributions are closely related: the real and imaginary parts of $N(\mu, \sigma^2)_{\mathbb{C}}$ are 
independent copies of $N(\operatorname{Re}\mu, \sigma^2/2)_{\mathbb{R}}$ and $N(\operatorname{Im}\mu, \sigma^2/2)_{\mathbb{R}}$ respectively. In a similar 
spirit, for any natural number $i$, we use $\chi_{i,\mathbb{R}}$ to denote the real $\chi$ distribution with 
$i$ degrees of freedom, thus $\chi_{i,\mathbb{R}} \equiv \sqrt{\xi_1^2 + \dots + \xi_i^2}$ for independent copies $\xi_1, \dots, \xi_i$ 
of $N(0, 1)_{\mathbb{R}}$. Similarly, we use $\chi_{i,\mathbb{C}}$ to denote the complex $\chi$ distribution with $i$ 
degrees of freedom, thus $\chi_{i,\mathbb{C}} \equiv \sqrt{|\xi_1|^2 + \dots + |\xi_i|^2}$ for independent copies $\xi_1, \dots, \xi_i$ of 
$N(0, 1)_{\mathbb{C}}$. Again, the two distributions are closely related: one has $\chi_{i,\mathbb{C}} \equiv \frac{1}{\sqrt{2}}\chi_{2i,\mathbb{R}}$ 
for all $i$. 

If $F : \mathbb{C}^k \to \mathbb{C}$ is a smooth function, we use $\nabla F(z_1, \dots, z_k)$ to denote the 
$2k$-dimensional vector whose components are the partial derivatives 
$\frac{\partial F}{\partial \operatorname{Re} z_i}(z_1, \dots, z_k)$, $\frac{\partial F}{\partial \operatorname{Im} z_i}(z_1, \dots, z_k)$ for $i = 1, \dots, k$. Iterating this, we can define $\nabla^a F(z_1, \dots, z_k)$ 
for any natural number $a$ as a tensor with $(2k)^a$ coefficients, each of which is an 
$a$-fold partial derivative of $F$ at $z_1, \dots, z_k$. The magnitude $|\nabla^a F(z_1, \dots, z_k)|$ is then 
defined as the $\ell^2$ norm of these coefficients. Similarly for functions defined on $\mathbb{R}^k$ 
instead of $\mathbb{C}^k$. 
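
The relation $\chi_{i,\mathbb{C}} \equiv \frac{1}{\sqrt{2}}\chi_{2i,\mathbb{R}}$ between the complex and real $\chi$ distributions defined above can be sanity-checked by simulation. The sketch below compares second moments only, which is a weak consistency check rather than a test of equality in distribution; the function names, sample size and seed are hypothetical choices.

```python
import math
import random

def sample_chi_complex(i, rng):
    """sqrt(|xi_1|^2 + ... + |xi_i|^2) with xi_j ~ N(0,1)_C, i.e. with
    independent real and imaginary parts distributed as N(0, 1/2)_R."""
    s = math.sqrt(0.5)
    return math.sqrt(sum(rng.gauss(0, s) ** 2 + rng.gauss(0, s) ** 2
                         for _ in range(i)))

def sample_chi_real(i, rng):
    """sqrt(xi_1^2 + ... + xi_i^2) with xi_j ~ N(0,1)_R."""
    return math.sqrt(sum(rng.gauss(0, 1) ** 2 for _ in range(i)))

def second_moments(i, trials=20000, seed=0):
    """Empirical E chi_{i,C}^2 and E (chi_{2i,R} / sqrt(2))^2; both should be near i."""
    rng = random.Random(seed)
    complex_m2 = sum(sample_chi_complex(i, rng) ** 2 for _ in range(trials)) / trials
    scaled_real_m2 = sum((sample_chi_real(2 * i, rng) / math.sqrt(2)) ** 2
                         for _ in range(trials)) / trials
    return complex_m2, scaled_real_m2

if __name__ == "__main__":
    print(second_moments(3))
```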



4. A concentration inequality 

In this section we recall a martingale type concentration inequality which will be 
useful in our arguments. Let $Y = Y(\xi_1, \dots, \xi_n)$ be a random variable depending 
on independent atom variables $\xi_i \in \mathbb{C}$. For $1 \le i \le n$ and $\xi = (\xi_1, \dots, \xi_n) \in \mathbb{C}^n$, 
define the martingale differences 

(21) $$C_i(\xi) := |\mathbf{E}(Y | \xi_1, \dots, \xi_i) - \mathbf{E}(Y | \xi_1, \dots, \xi_{i-1})|.$$

The classical Azuma's inequality (see e.g. [2]) states that if $C_i \le \alpha_i$ with 
probability one, then 

$$\mathbf{P}\left(|Y - \mathbf{E}Y| \ge \lambda \sqrt{\sum_{i=1}^n \alpha_i^2}\right) = O(\exp(-\Omega(\lambda^2))).$$

In applications, the assumption that $C_i \le \alpha_i$ with probability one sometimes fails. 
However, we can overcome this using a trick from [63]. In particular, the following 
is a simple variant of [63, Lemma 3.1]. 

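Proposition 34. For any $\alpha_i > 0$ we have the inequality 

$$\mathbf{P}\left(|Y - \mathbf{E}Y| \ge \lambda \sqrt{\sum_{i=1}^n \alpha_i^2}\right) \le O(\exp(-\Omega(\lambda^2))) + \sum_{i=1}^n \mathbf{P}(C_i(\xi) \ge \alpha_i).$$

Proof. For each $\xi$, let $i_\xi$ be the first index where $C_i(\xi) \ge \alpha_i$. Thus, the sets 
$B_i := \{\xi : i_\xi = i\}$ are disjoint. Define a function $Y'(\xi)$ of $\xi$ which agrees with $Y(\xi)$ 
for $\xi$ in the complement of $\bigcup_i B_i$, with $Y'(\xi) := \mathbf{E}_{B_i} Y$ if $\xi \in B_i$. It is clear that $Y'$ 
and $Y$ have the same mean and 

$$\mathbf{P}(Y \ne Y') \le \sum_{i=1}^n \mathbf{P}(C_i(\xi) \ge \alpha_i).$$

Moreover, $Y'$ satisfies the condition of Azuma's inequality, so 

$$\mathbf{P}\left(|Y' - \mathbf{E}Y'| \ge \lambda \sqrt{\sum_{i=1}^n \alpha_i^2}\right) \ll \exp(-\Omega(\lambda^2))$$

and the bound follows. 

□ 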
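
The classical Azuma inequality recalled in this section can be illustrated on the simplest martingale, a weighted sum of independent random signs $Y = \sum_i \alpha_i \epsilon_i$, for which the martingale differences (21) are exactly $\alpha_i$. The sketch below compares the empirical tail with the standard explicit Azuma-Hoeffding form $2\exp(-\lambda^2/2)$; that explicit constant is a textbook instance of the $O(\exp(-\Omega(\lambda^2)))$ bound and is not taken from this paper, and all names and parameters are hypothetical.

```python
import math
import random

def empirical_tail(alphas, lam, trials=20000, seed=0):
    """Empirical probability that |Y| >= lam * sqrt(sum alpha_i^2),
    where Y = sum_i alpha_i * eps_i with independent uniform random signs eps_i."""
    rng = random.Random(seed)
    scale = math.sqrt(sum(a * a for a in alphas))
    hits = 0
    for _ in range(trials):
        y = sum(a if rng.random() < 0.5 else -a for a in alphas)
        if abs(y) >= lam * scale:
            hits += 1
    return hits / trials

def azuma_bound(lam):
    """Explicit Azuma-Hoeffding tail bound 2 * exp(-lam^2 / 2)."""
    return 2.0 * math.exp(-lam * lam / 2.0)

if __name__ == "__main__":
    weights = [1.0 / (i + 1) for i in range(50)]
    for lam in (1.0, 2.0, 3.0):
        print(lam, empirical_tail(weights, lam), azuma_bound(lam))
```

Proposition 34 addresses the situation where the almost sure bound $C_i \le \alpha_i$ used here holds only outside an exceptional event, at the cost of the extra term $\sum_i \mathbf{P}(C_i(\xi) \ge \alpha_i)$.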



We have the following useful corollary. 

Proposition 35 (Martingale concentration). Let $\xi_1, \dots, \xi_n$ be independent complex 
random variables of mean zero and $|\xi_i| = n^{o(1)}$ with overwhelming probability for 
all $i$. Let $\alpha_1, \dots, \alpha_n > 0$ be positive real numbers, and for each $i = 1, \dots, n$, let 
$c_i(\xi_1, \dots, \xi_{i-1})$ be a complex random variable depending only on $\xi_1, \dots, \xi_{i-1}$ obeying 
the bound 

$$|c_i(\xi_1, \dots, \xi_{i-1})| \le \alpha_i$$

with overwhelming probability. Define $Y := \sum_{i=1}^n c_i(\xi_1, \dots, \xi_{i-1})\xi_i$. Then 

$$|Y| \ll n^{o(1)}\left(\sum_{i=1}^n \alpha_i^2\right)^{1/2}$$

with overwhelming probability. 

Proof. Let $C_i(\xi)$ be the martingale difference (21). It is easy to see that 
$C_i(\xi) = |c_i(\xi_1, \dots, \xi_{i-1})||\xi_i|$. By the assumptions, $C_i(\xi) \le n^{o(1)}\alpha_i$ with overwhelming 
probability. Now apply Proposition 34 with a suitable choice of parameter $\lambda = n^{o(1)}$. □ 

5. From log-determinant concentration to the local circular law 

In this section we prove Theorem 20 using Theorem 25. The first step is to deduce 
the crude bound (20) from Theorem 25. We first make some basic reductions. By a 
covering argument and the union bound it suffices to establish the claim for $r = 1$ 
and for a fixed $z_0 \in B(0, 2C\sqrt{n})$. 

The main tool will be Jensen's formula (15). Applying this to the disk $B(z_0, 2)$, 
we see in particular that 

(22) $$N_{B(z_0,1)} \ll \frac{1}{2\pi}\int_0^{2\pi} \left(\log|\det(M_n - z_0 - 2e^{\sqrt{-1}\theta})| - \log|\det(M_n - z_0)|\right) d\theta.$$

Let $A > 1$ and $\varepsilon > 0$ be arbitrary fixed quantities. In view of (22), it suffices to show that 

$$\frac{1}{2\pi}\int_0^{2\pi} \left(\log|\det(M_n - z_0 - 2e^{\sqrt{-1}\theta})| - \log|\det(M_n - z_0)|\right) d\theta = O(n^{\varepsilon})$$

with probability $1 - O(n^{-A})$. 

We will control this integral 10 by a Monte Carlo sum, using the following standard 
sampling lemma: 

Lemma 36 (Monte Carlo sampling lemma). Let $(X, \mu)$ be a probability space, and
let $F : X \to \mathbb{C}$ be a square-integrable function. Let $m \ge 1$, let $x_1, \ldots, x_m$ be drawn
independently at random from $X$ with distribution $\mu$, and let $S$ be the empirical
average

\[ S := \frac{1}{m} (F(x_1) + \cdots + F(x_m)). \]

Then $S$ has mean $\int_X F\,d\mu$ and variance $\frac{1}{m} \int_X (F - \int_X F\,d\mu)^2\,d\mu$. In particular, by
Chebyshev's inequality, one has

\[ \mathbf{P}\Big( \Big| S - \int_X F\,d\mu \Big| \ge \lambda \Big) \le \frac{1}{m\lambda^2} \int_X \Big( F - \int_X F\,d\mu \Big)^2 d\mu \]

for any $\lambda > 0$, or equivalently, for any $\delta > 0$ one has with probability at least $1 - \delta$
that

\[ \Big| S - \int_X F\,d\mu \Big| \le \frac{1}{\sqrt{m\delta}} \Big( \int_X \Big( F - \int_X F\,d\mu \Big)^2 d\mu \Big)^{1/2}. \]

10 One can also control this integral by a Riemann sum, using an argument similar to that used
to prove Theorem 20 below. On the other hand, we will use Lemma 36 again in Section 6, and
one can view the arguments below as a simplified warmup for the more complicated arguments in
that section.



Proof. The random variables $F(x_i)$ for $i = 1, \ldots, m$ are jointly independent with
mean $\int_X F\,d\mu$ and variance $\int_X (F - \int_X F\,d\mu)^2\,d\mu$. Averaging these variables,
we obtain the claim. □
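The Chebyshev bound in the sampling lemma can be illustrated with a short simulation. The sketch below is a minimal illustration, assuming a test function $F(\theta) = \cos^2\theta$ on $[0, 2\pi]$ with uniform measure (mean $1/2$, variance $1/8$); the function, the sample sizes and the helper names are choices made for this example only. It estimates how often the empirical average misses the mean by more than $\frac{1}{\sqrt{m\delta}}$ times the standard deviation, which the lemma bounds by $\delta$.

```python
import math
import random

def empirical_average(F, sampler, m, rng):
    """The average S = (F(x_1) + ... + F(x_m)) / m over m independent samples."""
    return sum(F(sampler(rng)) for _ in range(m)) / m

def chebyshev_radius(variance, m, delta):
    """Deviation (1/sqrt(m*delta)) * sqrt(variance), valid with probability >= 1 - delta."""
    return math.sqrt(variance / (m * delta))

rng = random.Random(0)

def F(theta):
    # Illustrative test function: mean 1/2 and variance 1/8 under uniform measure.
    return math.cos(theta) ** 2

def sampler(rng):
    return rng.uniform(0.0, 2.0 * math.pi)

true_mean, variance = 0.5, 0.125
m, delta, trials = 400, 0.05, 2000
radius = chebyshev_radius(variance, m, delta)

failures = sum(
    abs(empirical_average(F, sampler, m, rng) - true_mean) > radius
    for _ in range(trials)
)
failure_rate = failures / trials
print(radius, failure_rate)
```

Chebyshev's inequality is far from sharp for bounded functions such as this one, so the observed failure rate is typically much smaller than $\delta$.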



We apply this lemma to the probability space $X := [0, 2\pi]$ with uniform measure
$\frac{1}{2\pi}\,d\theta$, and to the function

\[ F(\theta) := \log|\det(M_n - z_0 - 2e^{\sqrt{-1}\theta})| - \log|\det(M_n - z_0)|. \]

Observe that for any complex number $z$, the function $\log|z - 2e^{\sqrt{-1}\theta}|$ has an $L^2(X)$
norm of $O(1)$. Thus by the triangle inequality and (14) we have the crude bound

\[ \int_X \Big( F - \int_X F\,d\mu \Big)^2 d\mu \ll n^2. \]

We set $\delta := n^{-A}$ and $m := n^{A+2}$. Let $\theta_1, \ldots, \theta_m$ be drawn independently uniformly
at random from $X$ (and independently of $M_n$) and set $\theta := (\theta_1, \ldots, \theta_m)$. Let $E_1$
denote the event that the inequality

\[ \Big| S - \int_X F\,d\mu \Big| \le \frac{1}{\sqrt{m\delta}} \Big( \int_X \Big( F - \int_X F\,d\mu \Big)^2 d\mu \Big)^{1/2} \]

holds, and let $E_2$ denote the event that the inequality

\[ \log|\det(M_n - z_0 - 2e^{\sqrt{-1}\theta_j})| - \log|\det(M_n - z_0)| \le n^{\varepsilon} \]

holds for all $j = 1, \ldots, m$. Call a pair $(M, \theta)$ good if $E_1$ and $E_2$ both hold. It
suffices to show that the probability that a pair $(M, \theta)$ (with $M = M_n$) is good is
$1 - O(n^{-A})$.

By Lemma 36, for each fixed $M$, the probability that $E_1$ fails is at most $\delta =
n^{-A}$. Moreover, by Theorem 25, we see that for each fixed $\theta_j$, the probability that
$\log|\det(M_n - z_0 - 2e^{\sqrt{-1}\theta_j})| - \log|\det(M_n - z_0)| \le n^{\varepsilon}$ fails is less than $O(n^{-2A-2})$.
Thus, by the union bound, the probability that $(M, \theta)$ is not good (over the product
space $M_n \times X^m$) is at most

\[ n^{-A} + m \times O(n^{-2A-2}) = O(n^{-A}), \]

concluding the proof of (20).

Now we are ready to prove Theorem 20. We assume $r \ge 10$, as the claim follows
trivially from Theorem 25 otherwise. Consider the circle $C_{z_0,r} := \{z \in \mathbb{C} : |z - z_0| =
r\}$. By the pigeonhole principle, there is some $0 \le j \le n$ such that the $\frac{1}{n^3}$-neighborhood
of the circle $C_j := C_{z_0,r_j}$ with $r_j := r - \frac{j}{n^2}$ contains no eigenvalues
of $M_n$ (notice that these neighborhoods are disjoint). If $j$ is such an index, we see
from (14) that the function

\[ F(\theta) := \log|\det(M_n - z_0 - r_j e^{\sqrt{-1}\theta})| - \log|\det(M_n - z_0)| \]

then has a Lipschitz norm of $O(n^{O(1)})$ on $[0, 2\pi]$. Setting $m := n^{A+2}$ for a sufficiently
large constant $A$, we then see from quadrature that the Riemann sum
$\frac{1}{m} \sum_{k=1}^m F(2\pi k/m)$ approximates the integral $\frac{1}{2\pi} \int_0^{2\pi} F(\theta)\,d\theta$ within an additive
error of at most $n^{o(1)}$. By (15), we conclude that

\[ \sum_{1 \le i \le n : |\lambda_i - z_0| < r_j} \log\frac{r_j}{|\lambda_i - z_0|} = \frac{1}{m} \sum_{k=1}^m F(2\pi k/m) + O(n^{o(1)}). \]

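On the other hand, from Theorem 25 (after rescaling by $\sqrt{n}$) and the
union bound we see that with overwhelming probability, we have

\[ F(2\pi k/m) = G(z_0 + r_j e^{2\pi\sqrt{-1}k/m}) - G(z_0) + O(n^{o(1)}) \]

for all $1 \le k \le m$, where $G(z)$ is defined as $\frac{1}{2}(|z|^2 - n)$ for $|z| \le \sqrt{n}$, and $n \log\frac{|z|}{\sqrt{n}}$
for $|z| \ge \sqrt{n}$. Applying quadrature again, we conclude (for $A$ large enough) that

\[ G(z_0) = -\sum_{1 \le i \le n : |\lambda_i - z_0| < r_j} \log\frac{r_j}{|\lambda_i - z_0|} + \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + r_j e^{\sqrt{-1}\theta})\,d\theta + O(n^{o(1)}). \]

A similar argument (replacing $r$ by $r - 1$) shows that with overwhelming probability,
there exists $0 \le j' \le n$ such that

\[ G(z_0) = -\sum_{1 \le i \le n : |\lambda_i - z_0| < r_{j'} - 1} \log\frac{r_{j'} - 1}{|\lambda_i - z_0|} + \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + (r_{j'} - 1) e^{\sqrt{-1}\theta})\,d\theta + O(n^{o(1)}). \]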

Also, from (20) and a simple covering argument, we know that with overwhelming
probability, there are at most $O(n^{o(1)} r)$ eigenvalues in the annular region between
$C_{z_0, r_{j'} - 1}$ and $C_{z_0, r_j}$, and in this region, the quantities $\log\frac{r_{j'} - 1}{|\lambda_i - z_0|}$ and $\log\frac{r_j}{|\lambda_i - z_0|}$ have
magnitude $O(1/r)$. We may thus subtract the above two estimates and conclude
that

\[ 0 = -N_{B(z_0,r)} \log\frac{r_j}{r_{j'} - 1} + \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + r_j e^{\sqrt{-1}\theta})\,d\theta - \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + (r_{j'} - 1) e^{\sqrt{-1}\theta})\,d\theta + O(n^{o(1)}). \tag{23} \]

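On the other hand, from applying Green's theorem^{11}

\[ \int_\Omega F(z) \Delta G(z) - \Delta F(z) G(z)\,dz = \int_{\partial\Omega} F(z) \frac{\partial}{\partial n} G(z) - \frac{\partial}{\partial n} F(z)\, G(z) \]

to the domain $\Omega := B(z_0, r_j) \setminus B(z_0, \varepsilon)$ with $F(z) := \log\frac{r_j}{|z - z_0|}$, and then sending
$\varepsilon \to 0$, one sees that

\[ G(z_0) = -\frac{1}{2\pi} \int_{B(z_0, r_j)} \Delta G(z) \log\frac{r_j}{|z - z_0|}\,dz + \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + r_j e^{\sqrt{-1}\theta})\,d\theta, \]

where $\Delta$ is the usual Laplacian on $\mathbb{C}$; one easily computes that $\Delta G(z) = 2 \cdot 1_{|z| < \sqrt{n}}$,
and thus

\[ G(z_0) = -\frac{1}{\pi} \int_{B(z_0, r_j)} 1_{|z| < \sqrt{n}} \log\frac{r_j}{|z - z_0|}\,dz + \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + r_j e^{\sqrt{-1}\theta})\,d\theta. \]

Similarly one has

\[ G(z_0) = -\frac{1}{\pi} \int_{B(z_0, r_{j'} - 1)} 1_{|z| < \sqrt{n}} \log\frac{r_{j'} - 1}{|z - z_0|}\,dz + \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + (r_{j'} - 1) e^{\sqrt{-1}\theta})\,d\theta. \]

Subtracting, and observing that the integrands $1_{|z| < \sqrt{n}} \log\frac{r_j}{|z - z_0|}$ and $1_{|z| < \sqrt{n}} \log\frac{r_{j'} - 1}{|z - z_0|}$
have magnitude $O(1/r)$ in the annular region between $C_{z_0, r_{j'} - 1}$ and $C_{z_0, r_j}$, we conclude
that

\[ 0 = -\frac{1}{\pi} \int_{B(z_0, r)} 1_{|z| < \sqrt{n}}\,dz \times \log\frac{r_j}{r_{j'} - 1} + \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + r_j e^{\sqrt{-1}\theta})\,d\theta - \frac{1}{2\pi} \int_0^{2\pi} G(z_0 + (r_{j'} - 1) e^{\sqrt{-1}\theta})\,d\theta + O(n^{o(1)}). \]

Comparing this with (23), we conclude with overwhelming probability that

\[ \Big( N_{B(z_0,r)} - \frac{1}{\pi} \int_{B(z_0, r)} 1_{|z| < \sqrt{n}}\,dz \Big) \times \log\frac{r_j}{r_{j'} - 1} = O(n^{o(1)}). \]

Since $\log\frac{r_j}{r_{j'} - 1}$ is comparable to $1/r$, we obtain (19) as desired.

11 The function $G$ has a mild singularity on the circle $|z| = \sqrt{n}$, but one can verify that as the
first derivatives of $G$ remain continuous across this circle, there is no difficulty in applying Green's
theorem even when $B(z_0, r_j)$ crosses this circle.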



6. Reduction to the Four Moment Theorem and log-determinant concentration

We now begin the task of proving Theorem 2 and Theorem 12, by reducing them to the
Four Moment Theorem for determinants (Theorem 23) and the local circular law
(Proposition 20). In the preceding section, of course, the local circular law has been
reduced in turn to the concentration of the log-determinant (Theorem 25).



6.1. The complex case. We begin with Theorem 2, deferring the slightly more 
complicated argument for Theorem 12 to the end of this section. 

Let $M_n, \tilde M_n$ be as in Theorem 2. Call a statistic $S(M_n)$ of (the law of) a random
matrix $M_n$ asymptotically $(M_n, \tilde M_n)$-insensitive, or insensitive for short, if we have

\[ S(M_n) - S(\tilde M_n) = O(n^{-c}) \]

for some fixed $c > 0$. Our objective is then to show that the statistic

\[ \int_{\mathbb{C}^k} F(w_1, \ldots, w_k)\, \rho_n^{(k)}(\sqrt{n} z_1 + w_1, \ldots, \sqrt{n} z_k + w_k)\,dw_1 \ldots dw_k \tag{24} \]

is insensitive for all fixed $k \ge 1$ and all $F$ of the form (8) for some fixed $m \ge 1$.

Fix $k$; we may assume inductively that the claim has already been proven for all
smaller $k$. By linearity we may take $m = 1$, thus we may assume that $F$ takes the
tensor product form

\[ F(w_1, \ldots, w_k) = F_1(w_1) \ldots F_k(w_k) \tag{25} \]

for some smooth, compactly supported $F_1, \ldots, F_k : \mathbb{C} \to \mathbb{C}$ supported on a fixed
ball, with bounds on derivatives up to second order.

Henceforth we assume that $F$ is in tensor product form (25). By (1) and the
inclusion-exclusion formula, we may thus write (24) in this case as

\[ \mathbf{E} \prod_{j=1}^k X_{z_j, F_j} \tag{26} \]

plus a fixed finite number of lower order terms that are of the form (24) for a smaller
value of $k$ (and a different choice of $F_j$), where $X_{z,F}$ is the linear statistic

\[ X_{z,F} := \sum_{i=1}^n F(\lambda_i(M_n) - \sqrt{n} z). \]

By the induction hypothesis, it thus suffices to show that the expression (26) is
insensitive.

Using the local circular law (Proposition 20), we see that for any $1 \le j \le k$, one
has $X_{z_j, F_j} = O(n^{o(1)})$ with overwhelming probability. Thus, one can truncate the
product function $\zeta_1, \ldots, \zeta_k \mapsto \zeta_1 \ldots \zeta_k$ and write

\[ \mathbf{E} \prod_{j=1}^k X_{z_j, F_j} = \mathbf{E}\, G(X_{z_1, F_1}, \ldots, X_{z_k, F_k}) + O(n^{-B}) \]

for any fixed $B$, where $G$ is a smooth truncation of the product function $\zeta_1, \ldots, \zeta_k \mapsto
\zeta_1 \ldots \zeta_k$ to the region $\zeta_1, \ldots, \zeta_k = O(n^{o(1)})$. Thus, it suffices to show that the quantity

\[ \mathbf{E}\, G(X_{z_1, F_1}, \ldots, X_{z_k, F_k}) \tag{27} \]

is insensitive whenever $G : \mathbb{C}^k \to \mathbb{C}$ is a smooth function obeying the bounds

\[ |\nabla^j G(\zeta_1, \ldots, \zeta_k)| \ll n^{o(1)} \tag{28} \]

for all fixed $j$ and all $\zeta_1, \ldots, \zeta_k \in \mathbb{C}$.

Fix $G$. As is standard in the spectral theory of random non-Hermitian matrices
(cf. [27], [10]), we now express the linear statistics $X_{z_j, F_j}$ in terms of the
log-determinant (14). By Green's theorem we have

\[ X_{z_j, F_j} = \int_{\mathbb{C}} \log|\det(M_n - z)|\, H_j(z)\,dz \tag{29} \]

where $H_j : \mathbb{C} \to \mathbb{C}$ is the function

\[ H_j(z) := \frac{1}{2\pi} \Delta F_j(z - \sqrt{n} z_j), \]

and $\Delta$ is the Laplacian on $\mathbb{C}$. From the derivative and support bounds on $F_j$, we
see that $H_j$ is supported on $B(\sqrt{n} z_j, C)$ and is bounded.

Naively, to control (29), one would apply Lemma 36 with the function
$\log|\det(M_n - z)|\, H_j(z)$. Unfortunately, the variance of this expression is too large, due to the
contributions of the eigenvalues far away from $\sqrt{n} z_j$. To cancel^{12} off these
contributions, we exploit the fact that $H_j(z)$, being the Laplacian of a smooth compactly
supported function, is orthogonal to all harmonic functions, and in particular to all
(real-)linear functions:

\[ \int_{\mathbb{C}} (a + b\,\mathrm{Re}(z) + c\,\mathrm{Im}(z))\, H_j(z)\,dz = 0. \]



(Recall that we use $dz$ to denote Lebesgue measure on $\mathbb{C}$.) We will need a reference
element $w_{j,0}$ drawn uniformly at random from $B(\sqrt{n} z_j, 1)$ (independently of $M_n$
and the $w_{j,i}$), and let $L(z) = L_j(z)$ denote the random linear function which equals
$\log|\det(M_n - z)|$ for $z = w_{j,0},\ w_{j,0} + 1,\ w_{j,0} + \sqrt{-1}$. More explicitly, one has

\[ L(z) := \log|\det(M_n - w_{j,0})|
 + \big( \log|\det(M_n - w_{j,0} - 1)| - \log|\det(M_n - w_{j,0})| \big)\, \mathrm{Re}(z - w_{j,0})
 + \big( \log|\det(M_n - w_{j,0} - \sqrt{-1})| - \log|\det(M_n - w_{j,0})| \big)\, \mathrm{Im}(z - w_{j,0}). \tag{30} \]
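To see why subtracting the linear interpolant in (30) removes the influence of distant eigenvalues, the following Python sketch builds the real-linear function agreeing with $z \mapsto \log|\lambda - z|$ at the three points $w_0$, $w_0 + 1$, $w_0 + \sqrt{-1}$, for a single far-away point $\lambda$, and measures the residual on a small square around $w_0$. The residual should decay like the inverse square of the distance from $\lambda$ to $w_0$. The helper names, the choice of $w_0$, the square and the distances are all illustrative assumptions.

```python
import math

def log_dist(lam, z):
    """The contribution log|lambda - z| of one eigenvalue to log|det(M - z)|."""
    return math.log(abs(lam - z))

def interpolant(f, w0):
    """Real-linear function agreeing with f at w0, w0 + 1 and w0 + i, as in (30)."""
    f0, f1, fi = f(w0), f(w0 + 1), f(w0 + 1j)
    def L(z):
        d = z - w0
        return f0 + (f1 - f0) * d.real + (fi - f0) * d.imag
    return L

def max_residual(lam, w0, half_side=2.0, grid=21):
    """Largest value of |log|lam - z| - L(z)| over a grid on a square centred at w0."""
    f = lambda z: log_dist(lam, z)
    L = interpolant(f, w0)
    worst = 0.0
    for a in range(grid):
        for b in range(grid):
            dx = -half_side + 2.0 * half_side * a / (grid - 1)
            dy = -half_side + 2.0 * half_side * b / (grid - 1)
            z = w0 + complex(dx, dy)
            worst = max(worst, abs(f(z) - L(z)))
    return worst

w0 = 0.2 + 0.1j
distances = [10.0, 20.0, 40.0, 80.0]
residuals = [max_residual(w0 + d, w0) for d in distances]
# If the residual is O(1/d^2), these rescaled values stay bounded as d grows.
scaled = [res * d * d for res, d in zip(residuals, distances)]
print(residuals, scaled)
```

The bounded values of `scaled` reflect the second-order Taylor remainder that is used later in the proof of Lemma 38.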

Remark 37. There is some freedom in how to select $L(z)$: for instance, it is
arguably more natural to replace the coefficients $\log|\det(M_n - w_{j,0} - 1)| - \log|\det(M_n -
w_{j,0})|$ and $\log|\det(M_n - w_{j,0} - \sqrt{-1})| - \log|\det(M_n - w_{j,0})|$ in the above formula
by the Taylor coefficients $\frac{d}{dt} \log|\det(M_n - w_{j,0} - t)|\,|_{t=0}$ and $\frac{d}{dt} \log|\det(M_n - w_{j,0} -
\sqrt{-1}\,t)|\,|_{t=0}$ instead. However, this would require extending the four moment theorem
for log-determinants to derivatives of log-determinants, which can be done but
will not be pursued here.

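Subtracting off $L(z)$, we have

\[ X_{z_j, F_j} = \int_{\mathbb{C}} K_j(z)\,dz \tag{31} \]

where $K_j : \mathbb{C} \to \mathbb{C}$ is the random function

\[ K_j(z) := (\log|\det(M_n - z)| - L(z))\, H_j(z). \tag{32} \]

Let us control the $L^2$ norm

\[ \|K_j\|_{L^2} := \Big( \int_{\mathbb{C}} |K_j(z)|^2\,dz \Big)^{1/2} \]

of this quantity.

Lemma 38. For any $\varepsilon > 0$, one has

\[ \|K_j\|_{L^2} \ll n^{\varepsilon + o(1)} \tag{33} \]

with probability $1 - O(n^{-\varepsilon})$ and all $1 \le j \le k$.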



12 It is natural to expect that these non-local contributions can be canceled, since the statistics
$X_{z_j, F_j}$ are clearly local in nature.

Proof. By the union bound, it suffices to prove the claim for a single $j$. We can
split $K_j = \sum_{i=1}^n K_{j,i}(z)$, where

\[ K_{j,i}(z) := (\log|\lambda_i(M_n) - z| - L_i(z))\, H_j(z) \]

and $L_i : \mathbb{C} \to \mathbb{C}$ is the random linear function that equals $\log|\lambda_i(M_n) - z|$ when
$z = w_{j,0},\ w_{j,0} + 1,\ w_{j,0} + \sqrt{-1}$. By the triangle inequality, we thus have

\[ \|K_j\|_{L^2} \le \sum_{i=1}^n \|K_{j,i}\|_{L^2}. \]

Thanks to Proposition 20, we know with overwhelming probability that one has

\[ N_{B(z_j \sqrt{n}, r)} \ll n^{o(1)} r^2 \tag{34} \]

for all $r$. Let us condition on the event that this holds, and then freeze $M_n$ (so that
the only remaining source of randomness is $w_{j,0}$). In particular, the eigenvalues
$\lambda_i(M_n)$ are now deterministic.

Let $C_0 > 1$ be such that $H_j$ is supported in $B(z_j \sqrt{n}, C_0)$. If $1 \le i \le n$ is such
that $\lambda_i(M_n) \in B(z_j \sqrt{n}, 2C_0)$, then a short computation (based on the
square-integrability of the logarithm function) shows that the expected value of $\|K_{j,i}\|_{L^2}$
(averaged over all choices of $w_{j,0}$) is $O(1)$. On the other hand, if $\lambda_i(M_n) \notin
B(z_j \sqrt{n}, 2C_0)$, then the second derivatives of $\log|\lambda_i(M_n) - z|$ have size $O(1/|\lambda_i(M_n) -
z_j \sqrt{n}|^2)$ on $B(z_j \sqrt{n}, 2C_0)$. From this and Taylor expansion, one sees that the
function $\log|\lambda_i(M_n) - z| - L_i(z)$ has magnitude $O(1/|\lambda_i(M_n) - z_j \sqrt{n}|^2)$ on this ball,
and so $\|K_{j,i}\|_{L^2}$ has this size as well. Summing, we conclude that the (conditional)
expected value of $\|K_j\|_{L^2}$ is at most

\[ \ll \sum_{i=1}^n \frac{1}{1 + |\lambda_i(M_n) - z_j \sqrt{n}|^2}. \tag{35} \]

We claim that the summation in (35) has magnitude $O(n^{o(1)})$ with overwhelming
probability, which will give the claim from Markov's inequality. To see this, first
observe that the eigenvalues $\lambda_i(M_n)$ with $|\lambda_i(M_n) - z_j \sqrt{n}| \ge \sqrt{n}$ certainly contribute
at most $O(1)$ in total to the above sum. Next, from (34) we see that with overwhelming
probability there are only $O(n^{o(1)})$ eigenvalues with $|\lambda_i(M_n) - z_j \sqrt{n}| \le 1$,
giving another contribution of $O(n^{o(1)})$ to the above sum. Similarly, for any $2^k$
between 1 and $\sqrt{n}$, another application of (34) reveals that the eigenvalues with
$2^k \le |\lambda_i(M_n) - z_j \sqrt{n}| < 2^{k+1}$ contribute another term of $O(n^{o(1)})$ to the above sum
with overwhelming probability. As there are only $O(\log \sqrt{n}) = O(n^{o(1)})$ possible
choices for $k$, the claim then follows by summing all the contributions estimated
above. □



Now let $\varepsilon > 0$ be a sufficiently small fixed constant that will be chosen later.
Set $m := \lfloor n^{10\varepsilon} \rfloor$, and for each $1 \le j \le k$ let $w_{j,1}, \ldots, w_{j,m}$ be drawn uniformly
at random from $B(\sqrt{n} z_j, C_0)$ (independently of $M_n$ and $w_{j,0}$). By (33), (31), and
Lemma 36, we see that with probability $1 - O(n^{-\varepsilon})$, one has

\[ X_{z_j, F_j} = \frac{\pi C_0^2}{m} \sum_{i=1}^m K_j(w_{j,i}) + O(n^{-3\varepsilon + o(1)}). \]

In particular, from (28) we see that with probability $1 - O(n^{-\varepsilon})$, one has

\[ G(X_{z_1, F_1}, \ldots, X_{z_k, F_k}) = G\Big( \Big( \frac{\pi C_0^2}{m} \sum_{i=1}^m K_j(w_{j,i}) \Big)_{1 \le j \le k} \Big) + O(n^{-3\varepsilon + o(1)}) \]

and hence

\[ \mathbf{E}\, G(X_{z_1, F_1}, \ldots, X_{z_k, F_k}) = \mathbf{E}\, G\Big( \Big( \frac{\pi C_0^2}{m} \sum_{i=1}^m K_j(w_{j,i}) \Big)_{1 \le j \le k} \Big) + O(n^{-\varepsilon + o(1)}). \]

Thus, to show that (27) is insensitive, it suffices to show that

\[ \mathbf{E}\, G\Big( \Big( \frac{\pi C_0^2}{m} \sum_{i=1}^m K_j(w_{j,i}) \Big)_{1 \le j \le k} \Big) \]

is insensitive, uniformly for all deterministic choices of $w_{j,0} \in B(\sqrt{n} z_j, 1)$ and $w_{j,i} \in
B(\sqrt{n} z_j, C_0)$ for $1 \le j \le k$ and $1 \le i \le m$. But this follows from the Four
Moment Theorem (Theorem 23), if $\varepsilon$ is small enough; indeed, once the $w_{j,0}, w_{j,i}$
are conditioned to be deterministic, we see from (32), (30) that the quantities
$K_j(w_{j,i})$ can be expressed as deterministic linear combinations of a bounded number
of log-determinants $\log|\det(M_n - z)|$, with coefficients uniformly bounded in $n$
(recall that $w_{j,i} - w_{j,0} = O(C_0)$ and that the $H_j$ are uniformly bounded). This
concludes the derivation of Theorem 2 from Theorem 23 and Proposition 20.



6.2. The real case. We now turn to the proof of Theorem 12. Let $M_n$ be as in
Theorem 12, and let $\tilde M_n$ be a real gaussian matrix. Our task is to show that
the quantity

\[ \int\!\!\int F(y_1, \ldots, y_k, w_1, \ldots, w_l)\, \rho_n^{(k,l)}(\sqrt{n} x_1 + y_1, \ldots, \sqrt{n} x_k + y_k, \sqrt{n} z_1 + w_1, \ldots, \sqrt{n} z_l + w_l)\,dw_1 \ldots dw_l\,dy_1 \ldots dy_k \tag{36} \]

is insensitive whenever $k, l \ge 0$ are fixed, $x_1, \ldots, x_k \in \mathbb{R}$ and $z_1, \ldots, z_l \in \mathbb{C}$ are
bounded, and $F$ decomposes as in Theorem 12.

By induction on $k + l$, much as in the complex case, and separating the spectrum
into contributions from $\mathbb{R}, \mathbb{C}_+, \mathbb{C}_-$, it thus suffices to show that the quantity

\[ \mathbf{E} \Big( \prod_{i=1}^k X_{x_i, F_i, \mathbb{R}} \Big) \Big( \prod_{j=1}^l X_{z_j, G_j, \mathbb{C}_+} \Big) \Big( \prod_{j'=1}^{l'} X_{z'_{j'}, G'_{j'}, \mathbb{C}_-} \Big) \tag{37} \]

is insensitive, where $k, l, l'$ are fixed, $x_1, \ldots, x_k \in \mathbb{R}$ and $z_1, \ldots, z_l, z'_1, \ldots, z'_{l'} \in \mathbb{C}$
are bounded,

\[ X_{x,F,\mathbb{R}} := \sum_{1 \le i \le n : \lambda_i(M_n) \in \mathbb{R}} F(\lambda_i(M_n) - \sqrt{n} x) \]

and

\[ X_{z,G,\mathbb{C}_\pm} := \sum_{1 \le i \le n : \lambda_i(M_n) \in \mathbb{C}_\pm} G(\lambda_i(M_n) - \sqrt{n} z), \]

and the $F_i : \mathbb{R} \to \mathbb{C}$, $G_j : \mathbb{C} \to \mathbb{C}$, $G'_{j'} : \mathbb{C} \to \mathbb{C}$ are smooth functions supported on
bounded sets obeying the bounds

\[ |\nabla^a F_i(x)|, |\nabla^a G_j(z)|, |\nabla^a G'_{j'}(z)| \le C \]

for all $0 \le a \le 5$, $x \in \mathbb{R}$, $z \in \mathbb{C}$. Indeed, one can express any statistic of the form
(36) as a linear combination of a bounded number of statistics of the form (37),
plus a bounded number of additional statistics of the form (36) with smaller values
of $k + l$.

As the spectrum is symmetric around the real axis, one has

\[ X_{z, G, \mathbb{C}_-} = X_{\bar z, \tilde G, \mathbb{C}_+} \]

where $\tilde G(z) := G(\bar z)$. Thus we may concatenate the $G_j$ with the $G'_{j'}$, and assume
without loss of generality that $l' = 0$; thus we are now seeking to establish the
insensitivity of

\[ \mathbf{E} \Big( \prod_{i=1}^k X_{x_i, F_i, \mathbb{R}} \Big) \Big( \prod_{j=1}^l X_{z_j, G_j, \mathbb{C}_+} \Big). \tag{38} \]

On the other hand, by repeating the remainder of the arguments for the complex
case with essentially no changes, we can show that the quantity

\[ \mathbf{E} \prod_{p=1}^m X_{z_p, H_p} \tag{39} \]

is insensitive for any fixed $m$, any bounded complex numbers $z_1, \ldots, z_m$, and any
smooth $H_p : \mathbb{C} \to \mathbb{C}$ supported in a bounded set and obeying the bounds

\[ |\nabla^a H_p(z)| \le C \]

for all $0 \le a \le 5$ and $z \in \mathbb{C}$, where

\[ X_{z,H} := \sum_{1 \le i \le n} H(\lambda_i(M_n) - \sqrt{n} z). \]

Thus the remaining task is to deduce the insensitivity of (38) from the insensitivity
of (39).

Specialising (39) to the case when $z_p = z$ is independent of $p$, and $H_p = H$ is
real-valued, we see that

\[ \mathbf{E}\, X_{z,H}^m \]

is insensitive for any $m$. In particular, we see from (the smooth version of) Urysohn's
lemma and Lemma 11 that we have the bound

\[ \mathbf{E}\, N_{B(z\sqrt{n}, C)}^m \ll 1 \tag{40} \]

for any fixed radius $C$ and any bounded complex number $z$, where $N_\Omega = N_\Omega[M_n]$
denotes the number of eigenvalues of $M_n$ in $\Omega$. Among other things, this implies
that

\[ \mathbf{E} |X_{x_i, F_i, \mathbb{R}}|^A,\ \mathbf{E} |X_{z_j, G_j, \mathbb{C}_+}|^A \ll 1 \tag{41} \]

for any fixed $A$ and all $i, j$.
To proceed further, we need a level repulsion result.

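Lemma 39 (Weak level repulsion). Let $C > 0$ be fixed, $x \in \mathbb{R}$ be bounded, and
$\varepsilon$ be such that $n^{-c_0} \le \varepsilon \le C$ for a sufficiently small fixed $c_0 > 0$, and let $E_{x,C,\varepsilon}$
be the event that there are two eigenvalues $\lambda_i(M_n), \lambda_j(M_n)$ in the strip $S_{x,C,\varepsilon} :=
\{z \in B(x\sqrt{n}, C) : |\mathrm{Im}(z)| \le \varepsilon\}$ with $i \ne j$ such that $|\lambda_i(M_n) - \lambda_j(M_n)| \le 2\varepsilon$. Then
$\mathbf{P}(E_{x,C,\varepsilon}) \ll \varepsilon$, where the implied constant in the $\ll$ notation is independent of $\varepsilon$.

Proof. In this proof all implied constants in the $\ll$ notation are understood to be
independent of $\varepsilon$. By a covering argument, it suffices to show that

\[ \mathbf{P}(N_{B(x\sqrt{n} + t, 10\varepsilon)} \ge 2) \ll \varepsilon^2 \]

uniformly for all $t = O(1)$.
We split

\[ N_{B(x\sqrt{n} + t, 10\varepsilon)} = N_{B(x\sqrt{n} + t, 10\varepsilon) \cap \mathbb{R}} + 2 N_{B(x\sqrt{n} + t, 10\varepsilon) \cap \mathbb{C}_+} \]

so it will suffice to show that

\[ \mathbf{P}(N_{B(x\sqrt{n} + t, 10\varepsilon) \cap \mathbb{R}} \ge 2) \ll \varepsilon^2 \tag{42} \]

and

\[ \mathbf{P}(N_{B(x\sqrt{n} + t, 10\varepsilon) \cap \mathbb{C}_+} \ge 1) \ll \varepsilon^2. \tag{43} \]

We first show (43). If we let $H$ be a bump function supported on $B(t, 20\varepsilon)$ that
equals one on $B(t, 10\varepsilon)$, then we have

\[ N_{B(x\sqrt{n} + t, 10\varepsilon) \cap \mathbb{C}_+} \le X_{x, H, \mathbb{C}_+} \]

and so by Markov's inequality it suffices to show that

\[ \mathbf{E}\, X_{x, H, \mathbb{C}_+} \ll \varepsilon^2. \tag{44} \]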

By the insensitivity of (39) and the lower bound on $\varepsilon$, it suffices to verify the claim
when $M_n$ is drawn from the real gaussian distribution. (Note that the derivatives of
$H$ can be as large as $O(\varepsilon^{-O(1)})$, causing additional factors of $O(\varepsilon^{-O(1)})$ to appear in
the error term created when swapping $M_n$ with the real gaussian ensemble, but the
$n^{-c}$ gain coming from the insensitivity will counteract this if $c_0$ is small enough.)
But we may expand the left-hand side of (44) as

\[ \int_{\mathbb{C}_+} \rho_n^{(0,1)}(x\sqrt{n} + z)\, H(z)\,dz. \]

Using Lemma 11 we see that this expression is $O(\varepsilon^2)$ as required.

Now we establish (42). Let $H$ be as before; we observe that $X_{x,H,\mathbb{R}}^2 - X_{x,H^2,\mathbb{R}}$
is non-negative, and is at least 2 when $N_{B(x\sqrt{n} + t, 10\varepsilon) \cap \mathbb{R}} \ge 2$ (in fact it is at least
$2 \binom{N_{B(x\sqrt{n} + t, 10\varepsilon) \cap \mathbb{R}}}{2}$). Thus it suffices to show that

\[ \mathbf{E}\, X_{x,H,\mathbb{R}}^2 - X_{x,H^2,\mathbb{R}} \ll \varepsilon^2. \tag{45} \]

Again it suffices to establish this when $M_n$ is drawn from the real gaussian
distribution. But we may expand the left-hand side of (45) as

\[ \int_{\mathbb{R}} \int_{\mathbb{R}} \rho_n^{(2,0)}(x\sqrt{n} + y, x\sqrt{n} + y')\, H(y) H(y')\,dy\,dy'. \]

Using Lemma 11 again, we see that this expression is $O(\varepsilon^2)$ as required. □
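The combinatorial fact behind (45) is that $X_{x,H,\mathbb{R}}^2 - X_{x,H^2,\mathbb{R}}$ equals the sum of $H(\lambda_i)H(\lambda_j)$ over ordered pairs $i \ne j$, and is therefore at least $N(N-1) = 2\binom{N}{2}$, where $N$ counts the points at which $H = 1$. The Python sketch below checks this on a small hypothetical point configuration. It assumes a piecewise-linear cutoff in place of the smooth bump function $H$ (smoothness plays no role in this identity), and the points are synthetic rather than eigenvalues of any matrix.

```python
import random

def bump(y, inner, outer):
    """Continuous cutoff: 1 for |y| <= inner, 0 for |y| >= outer, linear in between.

    A piecewise-linear stand-in for the smooth bump function H of the text.
    """
    a = abs(y)
    if a <= inner:
        return 1.0
    if a >= outer:
        return 0.0
    return (outer - a) / (outer - inner)

rng = random.Random(1)
eps = 0.05
points = [rng.uniform(-1.5, 1.5) for _ in range(12)]   # hypothetical real "eigenvalues"
H = [bump(y, 10 * eps, 20 * eps) for y in points]

X_H = sum(H)                      # linear statistic of H
X_H2 = sum(h * h for h in H)      # linear statistic of H^2
statistic = X_H ** 2 - X_H2       # the quantity whose expectation appears in (45)

# Direct sum over ordered pairs of distinct indices.
pair_sum = sum(H[i] * H[j] for i in range(len(H)) for j in range(len(H)) if i != j)

# Number of points in the region where H equals one, and the bound 2 * binom(N, 2).
N = sum(1 for y in points if abs(y) <= 10 * eps)
lower_bound = N * (N - 1)
print(statistic, pair_sum, lower_bound)
```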

Remark 40. In fact, a closer inspection of the explicit form of the correlation
functions reveals that one can gain some additional powers of $\varepsilon$ here, giving a
stronger amount of level repulsion, but for our purposes any bound that goes to
zero as $\varepsilon \to 0$ will suffice.

From the symmetry of the spectrum, we observe that if $E_{x,C,\varepsilon}$ does not hold, then
there cannot be any strictly complex eigenvalue $\lambda_i(M_n)$ in the strip $S_{x,C,\varepsilon}$, since in
that case $\overline{\lambda_i(M_n)}$ would be a distinct eigenvalue in the strip at a distance at most $2\varepsilon$
from $\lambda_i(M_n)$. In particular, we see that

\[ \mathbf{P}(N_{S_{x,C,\varepsilon} \setminus [x\sqrt{n} - C, x\sqrt{n} + C]} = 0) = 1 - O(\varepsilon). \tag{46} \]

Informally, this estimate tells us that we can usually thicken the interval $[x\sqrt{n} -
C, x\sqrt{n} + C]$ to the strip $S_{x,C,\varepsilon}$ without encountering any additional spectrum.

Fix $\varepsilon := n^{-c_0}$ for some sufficiently small fixed $c_0 > 0$. We can use (46) to
simplify the expression (38) in two ways. Firstly, thanks to (46), (41), and Hölder's
inequality, we may replace each of the $G_j$ in (37) with a function $\tilde G_j$ that vanishes
on the strip $\{z - z_j : |\mathrm{Im}(z)| \le \varepsilon\}$, while only picking up an error of $O(\varepsilon^c)$ for
some fixed $c > 0$, which will be acceptable from the choice of $\varepsilon$. By discarding
the component of $\tilde G_j$ below the strip, we may then assume $\tilde G_j$ is supported on the
half-space $\mathbb{C}_+ - z_j$. In particular, we have

\[ X_{z_j, \tilde G_j, \mathbb{C}_+} = X_{z_j, \tilde G_j}. \]

Also, by performing a smooth truncation, we see that we have the derivative bounds
$\nabla^a \tilde G_j = O(\varepsilon^{-O(1)})$ for all $0 \le a \le 5$.

Secondly, by another application of (46), (41), and Hölder's inequality, we may
"thicken" each factor $X_{x_i, F_i, \mathbb{R}}$ by replacing it with $X_{x_i, \tilde F_i}$, where $\tilde F_i : \mathbb{C} \to \mathbb{C}$ is a
smooth extension of $F_i$ that is supported on the strip $\{z : |\mathrm{Im}(z)| \le \varepsilon\}$, while only
acquiring an error of $O(\varepsilon^c)$ for some fixed $c > 0$. Again, we have the derivative
bounds $\nabla^a \tilde F_i = O(\varepsilon^{-O(1)})$ for $0 \le a \le 5$. From the insensitivity of (39) (and
using the $n^{-c}$ gain coming from insensitivity to absorb all $O(\varepsilon^{-O(1)})$ losses from
the derivative bounds) we see that

\[ \mathbf{E} \Big( \prod_{i=1}^k X_{x_i, \tilde F_i} \Big) \Big( \prod_{j=1}^l X_{z_j, \tilde G_j} \Big) \tag{47} \]

is insensitive, which by the preceding discussion yields (for $c_0$ small enough) that
(38) is insensitive also, as required. This concludes the derivation of Theorem 12
from Theorem 23 and Proposition 20.

6.3. Quick applications. As quick consequences of Theorem 2 and Theorem 12,
we now prove Corollaries 10, 17 and 18.

We first prove Corollary 18. Let $M_n$ be as in that theorem. Set $\varepsilon := n^{-c_0}$
for some sufficiently small $c_0 > 0$. A routine modification of the proof of Lemma
39 (or, alternatively, Theorem 12 combined with Lemma 11) shows that for any
$z \in B(0, O(\sqrt{n}))$, one has

\[ \mathbf{E} \binom{N_{B(z,\varepsilon)}}{2} \ll \varepsilon^4 \]

when $|\mathrm{Im}\,z| \ge \varepsilon$, if $c_0$ is small enough; in particular, the expected number of
eigenvalues in $B(z, \varepsilon)$ which are repeated is $O(\varepsilon^4)$. We then cover $B(0, 3\sqrt{n})$ by $O(n/\varepsilon^2)$
balls $B(z, \varepsilon)$ with $|\mathrm{Im}\,z| \ge \varepsilon$, together with the strip $\{z : |\mathrm{Im}\,z| \le \varepsilon\}$. By (46)
(or Theorem 12 and Lemma 11) and linearity of expectation, the strip contains
$O(\varepsilon\sqrt{n})$ eigenvalues. By [4], [25], the spectral radius of $M_n$ is known to equal
$(1 + o(1))\sqrt{n}$ with overwhelming probability^{13}. We conclude that the expected
number of repeated complex eigenvalues is at most

\[ O(n/\varepsilon^2) \times O(\varepsilon^4) + O(\varepsilon\sqrt{n}) + O(n^{-100}), \]

which becomes $O(n^{1-c})$ for some fixed $c > 0$; a similar argument gives a bound of
$O(n^{1/2-c})$ for the expected number of repeated real eigenvalues. The claim now
follows from Markov's inequality.

Now we prove Corollary 17. Let $M_n$ be as in that theorem. As mentioned previously,
the spectral radius of $M_n$ is known to equal $(1 + o(1))\sqrt{n}$ with overwhelming
probability. In particular, we have

\[ \mathbf{E}\, N_{\mathbb{R}}(M_n) = \mathbf{E}\, N_{[-3\sqrt{n}, 3\sqrt{n}]}(M_n) + O(n^{-100}) \]

(say). By the smooth form of Urysohn's lemma, we can select fixed smooth,
non-negative functions $F_-, F_+$ such that we have the pointwise bounds

\[ 1_{[-2,2]} \le F_- \le 1_{[-3,3]} \le F_+ \le 1_{[-4,4]}. \]

By definition of $\rho_n^{(1,0)}$, we observe that

\[ \mathbf{E}\, N_{[-2\sqrt{n}, 2\sqrt{n}]}(M_n) \le \int_{\mathbb{R}} \rho_n^{(1,0)}(x) F_-(x/\sqrt{n})\,dx
 \le \mathbf{E}\, N_{[-3\sqrt{n}, 3\sqrt{n}]}(M_n)
 \le \int_{\mathbb{R}} \rho_n^{(1,0)}(x) F_+(x/\sqrt{n})\,dx
 \le \mathbf{E}\, N_{[-4\sqrt{n}, 4\sqrt{n}]}(M_n). \]

By smoothly partitioning $F_\pm(x/\sqrt{n})$ into $O(\sqrt{n})$ pieces supported on intervals of
size $O(1)$, and applying Theorem 12 to each piece, we see upon summing that the
two integrals above are only modified by $O(n^{1/2-c})$ for some fixed $c > 0$ if we
replace $M_n$ with a real gaussian matrix $M'_n$. On the other hand, when $M'_n$ is real
gaussian we see from Theorem 16 (and the spectral radius bound) that

\[ \mathbf{E}\, N_{[-2\sqrt{n}, 2\sqrt{n}]}(M'_n),\ \mathbf{E}\, N_{[-4\sqrt{n}, 4\sqrt{n}]}(M'_n) = \sqrt{\frac{2n}{\pi}} + O(1). \]



13 Actually, for this argument, the easier bound of O(1) would suffice, which can be obtained
by a variety of methods, e.g. by an epsilon net argument or by Talagrand's inequality [49].

Putting these bounds together, we obtain the expectation claim of Corollary 17.
The variance claim is similar. Indeed, we have

\[ \mathbf{E}\, N_{\mathbb{R}}(M_n)^2 = \mathbf{E}\, N_{[-3\sqrt{n}, 3\sqrt{n}]}(M_n)^2 + O(n^{-90}) \]

(say) and

\[ \mathbf{E}\, N_{[-2\sqrt{n}, 2\sqrt{n}]}(M_n)^2 \le \int_{\mathbb{R}} \rho_n^{(1,0)}(x) F_-(x/\sqrt{n})^2\,dx + \int_{\mathbb{R}} \int_{\mathbb{R}} \rho_n^{(2,0)}(x,y) F_-(x/\sqrt{n}) F_-(y/\sqrt{n})\,dx\,dy
 \le \mathbf{E}\, N_{[-3\sqrt{n}, 3\sqrt{n}]}(M_n)^2
 \le \int_{\mathbb{R}} \rho_n^{(1,0)}(x) F_+(x/\sqrt{n})^2\,dx + \int_{\mathbb{R}} \int_{\mathbb{R}} \rho_n^{(2,0)}(x,y) F_+(x/\sqrt{n}) F_+(y/\sqrt{n})\,dx\,dy
 \le \mathbf{E}\, N_{[-4\sqrt{n}, 4\sqrt{n}]}(M_n)^2. \]

From Theorem 12 and smooth decomposition we see that all of the above integrals
vary by $O(n^{1-c})$ at most for some fixed $c > 0$ if $M_n$ is replaced with a real gaussian
matrix, and then the variance claim can be deduced from Theorem 16 and the
spectral radius bound as before.

Remark 41. A similar argument shows that in the complex case, the expected
number of real eigenvalues is $O(n^{1/2-c})$, which can be improved to $O(n^{-A})$ for any
$A > 0$ if one assumes sufficiently many matching moments depending on $A$. Of
course, one expects typically in this case that there are no real eigenvalues whatsoever
(and this is almost surely the case when the matrix ensemble is continuous),
but this is beyond the ability of our current methods to establish in the case of
discrete complex matrices.

Finally, we prove Corollary 10. Let $M_n, z_0, r$ be as in that theorem, and let $\tilde M_n$
be drawn from the complex gaussian matrix ensemble. Let $\varepsilon = o(1)$ be a slowly
decaying function of $n$ to be chosen later. Let $R$ be any rectangle in $B(0, 100\sqrt{n})$ of
sidelength $1 \times n^{-\varepsilon}$, and let $3R$ be the rectangle with the same center as $R$ but three
times the sidelengths. By the smooth form of Urysohn's lemma, we can construct
a smooth function $F : \mathbb{C} \to \mathbb{R}^+$ with the pointwise bounds

\[ 1_R \le F \le 1_{3R} \]

such that $|\nabla^j F| \ll n^{j\varepsilon}$ for all $0 \le j \le 5$. Applying Corollary 15 (to $n^{-5\varepsilon} F$), we
conclude that

\[ \int_{\mathbb{C}} F(z) \rho_n^{(1)}(z)\,dz = \int_{\mathbb{C}} F(z) \tilde\rho_n^{(1)}(z)\,dz + O(n^{-c + 5\varepsilon}) \]

for some absolute constant $c$. On the other hand, from (5) we see that $\int_{\mathbb{C}} F(z) \tilde\rho_n^{(1)}(z)\,dz \ll
n^{-\varepsilon}$, since $3R$ has area $O(n^{-\varepsilon})$. Since $\varepsilon = o(1)$, we conclude that

\[ \int_{\mathbb{C}} F(z) \rho_n^{(1)}(z)\,dz \ll n^{-\varepsilon} \]

and in particular that

\[ \mathbf{E}\, N_R(M_n) \ll n^{-\varepsilon}. \tag{48} \]

A similar argument (with larger values of $k$) gives

\[ \mathbf{E}\, N_{R_1}(M_n) \ldots N_{R_k}(M_n) \ll n^{-k\varepsilon} \tag{49} \]

whenever $k$ is fixed and $R_1, \ldots, R_k$ are $1 \times n^{-\varepsilon}$ rectangles (possibly overlapping) in
$B(0, 100\sqrt{n})$.

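Now let $G : \mathbb{C} \to \mathbb{R}^+$ be a smooth function supported on $B(z_0, r + n^{-\varepsilon})$ which
equals 1 on $B(z_0, r)$ and has the derivative bounds $|\nabla^j G| \ll n^{j\varepsilon}$ for all $0 \le j \le 5$.
By covering the annulus $B(z_0, r + n^{-\varepsilon}) \setminus B(z_0, r)$ by $O(r)$ rectangles of dimension
$1 \times n^{-\varepsilon}$, we see from (48) that

\[ \mathbf{E}\, N_{B(z_0, r + n^{-\varepsilon}) \setminus B(z_0, r)}(M_n) \ll r n^{-\varepsilon} \]

and similarly from (49) one has

\[ \mathbf{E}\, N_{B(z_0, r + n^{-\varepsilon}) \setminus B(z_0, r)}(M_n)^k \ll r^k n^{-k\varepsilon} \]

for any fixed $k$. Since we are assuming $r \le n^{o(1)}$, we conclude (if $\varepsilon$ decays to zero
sufficiently slowly) that

\[ \mathbf{E}\, N_{B(z_0, r + n^{-\varepsilon}) \setminus B(z_0, r)}(M_n)^k = o(1) \tag{50} \]

for all $k$. In particular, if we introduce the linear statistic

\[ X := \frac{\sum_{i=1}^n G(\lambda_i(M_n)) - r^2}{r^{1/2} \pi^{-1/4}} \]

we see from the triangle inequality that the asymptotics

\[ \mathbf{E} \Big( \frac{N_{B(z_0, r)}(M_n) - r^2}{r^{1/2} \pi^{-1/4}} \Big)^k \to \mathbf{E}\, N(0,1)^k \]

for all fixed $k \ge 0$ are equivalent to the asymptotics

\[ \mathbf{E}\, X^k \to \mathbf{E}\, N(0,1)^k. \]

Let $\tilde X$ be the analogue of $X$ for $\tilde M_n$. From Theorem 9 and the preceding arguments
we have

\[ \mathbf{E}\, \tilde X^k \to \mathbf{E}\, N(0,1)^k \]

and so it will suffice to show that

\[ \mathbf{E}\, X^k - \mathbf{E}\, \tilde X^k = o(1) \]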

for all fixed $k \ge 1$. By (50) and the hypotheses that $1 \le r \le n^{o(1)}$ and $\varepsilon = o(1)$, it
will suffice to show that

\[ \mathbf{E} \Big( \sum_{i=1}^n G(\lambda_i(M_n)) \Big)^k - \mathbf{E} \Big( \sum_{i=1}^n G(\lambda_i(\tilde M_n)) \Big)^k = O(r^{O(k)} n^{-c + O(k\varepsilon)}) \]

for all fixed $k \ge 0$ and some fixed $c > 0$ (which will in fact turn out to be uniform in
$k$, although we will not need this fact). Expanding out the $k$-th powers and collecting
terms^{14} depending on the multiplicities of the $i$ indices, we see that it suffices to
show that

\[ \mathbf{E} \sum_{1 \le i_1 < \ldots < i_{k'} \le n} G^{a_1}(\lambda_{i_1}(M_n)) \ldots G^{a_{k'}}(\lambda_{i_{k'}}(M_n)) - G^{a_1}(\lambda_{i_1}(\tilde M_n)) \ldots G^{a_{k'}}(\lambda_{i_{k'}}(\tilde M_n)) = O(r^{O(k)} n^{-c + O(k\varepsilon)}) \]

for all fixed $k', a_1, \ldots, a_{k'} \ge 1$ and some fixed $c > 0$, where $k := a_1 + \ldots + a_{k'}$. But
the left-hand side can be rewritten using (1) as

\[ \int_{\mathbb{C}^{k'}} \Big( \prod_{j=1}^{k'} G(z_j)^{a_j} \Big) \big( \rho_n^{(k')}(z_1, \ldots, z_{k'}) - \tilde\rho_n^{(k')}(z_1, \ldots, z_{k'}) \big)\,dz_1 \ldots dz_{k'}. \]

One can smoothly decompose $\prod_{j=1}^{k'} G(z_j)^{a_j}$ as the sum of $O(r^{O(1)} n^{o(1)})$ smooth
functions supported on balls of bounded radius, whose derivatives up to fifth order
are all uniformly bounded. Applying Theorem 2 to each such function and
summing, one obtains the claim.

14 The observant reader will note that this step is inverting one of the first steps in the proof
of Theorem 2 given previously, and one could shorten the total length of the argument here if
desired by skipping directly to that point of the proof of Theorem 2 and continuing onwards from
there.

Remark 42. The main reason why the radius $r$ was restricted to be $O(n^{o(1)})$ was
because of the need to obtain asymptotics for $k$-th moments for arbitrary fixed $k$.
For any given $k$, the above arguments show that one obtains the right asymptotics
for all $r \le n^{c/k}$ for some absolute constant $c > 0$. If one increases the number of
matching moment assumptions, one can increase the value of $k$, but we were unable
to find an argument that allowed one to take $r$ as large as $n^{\alpha}$ for some fixed $\alpha > 0$
independent of $k$, even after assuming a large number of matching moments.

7. Resolvent swapping 

In this section we recall some facts about the stability of the resolvent of Hermitian
matrices with respect to perturbation in just one or two entries, in order to perform
swapping arguments. Such swapping arguments were introduced to random matrix
theory in [11], and first applied to establish universality results for local spectral
statistics in [56]. In [21] it was observed that the stability analysis of such swapping
was particularly simple if one worked with the resolvents (or Green's function) rather
than with individual eigenvalues. Our formalisation of this analysis here is drawn
from [60]. We will use this resolvent swapping analysis twice in this paper: once to
establish the Four Moment Theorem for the determinant (Theorem 23) in Section 8,
and once to deduce concentration of the log-determinant for iid matrices (Theorem
25) from concentration for gaussian matrices (Theorem 33) in Section 10.

We will need the matrix norm

\[ \|A\|_{(\infty,1)} = \sup_{1 \le i,j \le n} |a_{ij}| \]

and the following definition:

Definition 43 (Elementary matrix). An elementary matrix is a matrix which has
one of the following forms

\[ V = e_a e_a^*,\quad e_a e_b^* + e_b e_a^*,\quad \sqrt{-1}\, e_a e_b^* - \sqrt{-1}\, e_b e_a^* \tag{51} \]

with $1 \le a, b \le n$ distinct, where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{C}^n$.

Let $M_0$ be a Hermitian matrix, let $z = E + i\eta$ be a complex number, and let $V$ be
an elementary matrix. We then introduce, for each $t \in \mathbb{R}$, the Hermitian matrices

\[ M_t := M_0 + \frac{1}{\sqrt{n}} t V, \]

the resolvents

\[ R_t = R_t(E + i\eta) := (M_t - E - i\eta)^{-1} \tag{52} \]

and the Stieltjes transform

\[ s_t := s_t(E + i\eta) := \frac{1}{n} \operatorname{trace} R_t(E + i\eta). \]

We have the following Neumann series expansion:

Lemma 44 (Neumann series). Let $M_0$ be a Hermitian $n \times n$ matrix, let $E \in \mathbb{R}$,
$\eta > 0$, and $t \in \mathbb{R}$, and let $V$ be an elementary matrix. Suppose one has

\[ |t|\, \|R_0\|_{(\infty,1)} = o(\sqrt{n}). \tag{53} \]

Then one has the Neumann series formula

\[ R_t = R_0 + \sum_{j=1}^\infty \Big( -\frac{t}{\sqrt{n}} \Big)^j (R_0 V)^j R_0 \tag{54} \]

with the right-hand side being absolutely convergent, where $R_t$ is defined by (52).
Furthermore, we have

\[ \|R_t\|_{(\infty,1)} \le (1 + o(1)) \|R_0\|_{(\infty,1)}. \tag{55} \]

In practice, we will have $t = n^{O(c_0)}$ (from a decay hypothesis on the atom distribution)
and $\|R_0\|_{(\infty,1)} = n^{O(c_0)}$ (from eigenvector delocalization and a level repulsion
hypothesis), where $c_0 > 0$ is a small constant, so (53) is quite a mild condition.

Proof. See [60, Lemma 12]. □ 
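The Neumann series formula (54) can be checked directly in the smallest nontrivial case. The following Python sketch is a minimal illustration, assuming a hypothetical $2 \times 2$ Hermitian matrix $M_0$, the elementary matrix $V = e_1 e_2^* + e_2 e_1^*$, and arbitrarily chosen values of $t$ and $z = E + i\eta$; none of these values come from the paper. It compares the exact resolvent $R_t$ with a partial sum of the series.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add_scaled(A, B, c):
    """Entrywise A + c * B."""
    return [[A[i][j] + c * B[i][j] for j in range(2)] for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def resolvent(M, z):
    """(M - z)^{-1}, as in (52)."""
    return inverse([[M[0][0] - z, M[0][1]], [M[1][0], M[1][1] - z]])

def neumann_partial_sum(R0, V, t, n, terms):
    """R0 + sum_{j=1}^{terms} (-t/sqrt(n))^j (R0 V)^j R0, a truncation of (54)."""
    total, power = R0, R0          # power holds (R0 V)^j R0, starting from j = 0
    R0V = matmul(R0, V)
    for j in range(1, terms + 1):
        power = matmul(R0V, power)
        total = add_scaled(total, power, (-t / math.sqrt(n)) ** j)
    return total

n = 2
M0 = [[0.5, 0.3 - 0.2j], [0.3 + 0.2j, -0.4]]   # hypothetical Hermitian matrix
V = [[0.0, 1.0], [1.0, 0.0]]                   # elementary matrix e_1 e_2^* + e_2 e_1^*
z = 0.1 + 1.0j                                 # E + i*eta with eta = 1
t = 0.3

R0 = resolvent(M0, z)
Rt = resolvent(add_scaled(M0, V, t / math.sqrt(n)), z)
approx = neumann_partial_sum(R0, V, t, n, terms=40)
neumann_error = max(abs(Rt[i][j] - approx[i][j]) for i in range(2) for j in range(2))
print(neumann_error)
```

Here the operator norm of $R_0$ is at most $1/\eta = 1$ and $|t|/\sqrt{n} \approx 0.21$, so the series converges geometrically and forty terms already reach machine precision.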



We now can describe the dependence of s t on t: 

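Proposition 45 (Taylor expansion of $s_t$). Let the notation be as above, and suppose
that (53) holds. Let $k \ge 1$ be fixed. Then one has

\[ s_t = s_0 + \sum_{j=1}^k n^{-j/2} c_j t^j + O\Big( n^{-(k+1)/2} |t|^{k+1} \|R_0\|_{(\infty,1)}^{k+1} \min\Big( \|R_0\|_{(\infty,1)}, \frac{1}{n\eta} \Big) \Big) \tag{56} \]

where the coefficients $c_j$ are independent of $t$ and obey the bounds

\[ |c_j| \ll \|R_0\|_{(\infty,1)}^j \min\Big( \|R_0\|_{(\infty,1)}, \frac{1}{n\eta} \Big) \tag{57} \]

for all $1 \le j \le k$.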



Proof. See [60, Proposition 13]. □






8. Proof of the Four Moment Theorem 
We now prove Theorem 23. 

We begin with some simple reductions. Observe that each entry $\xi_{ij}$ of $M_n$ has size at most $O(n^{o(1)})$ with overwhelming probability. Thus, by modifying the distributions of the $\xi_{ij}$ slightly (taking care to retain the moment matching property; see footnote 15), we may assume that all entries surely have size $O(n^{o(1)})$. Thus

$$\|M_n\|_{(\infty,1)}, \|M'_n\|_{(\infty,1)} \ll n^{o(1)}. \tag{58}$$

We may also assume that $G$ is bounded by $1$ rather than by $n^{c_0}$, since the general claim then follows by normalising $G$ and shrinking $c_0$ as necessary; thus

$$|G(x_1, \ldots, x_k)| \le 1 \tag{59}$$

for all $x_1, \ldots, x_k \in \mathbb{R}$.

Fix $M_n, M'_n$. Recall that a statistic $S$ is asymptotically $(M_n, M'_n)$-insensitive, or insensitive for short, if one has

$$|S(M_n) - S(M'_n)| \ll n^{-c}$$

for some fixed $c > 0$. By shrinking $c_0$ if necessary, our task is thus to show that the quantity

$$\mathbf{E}\, G\left(\log|\det(M_n - z_1)|, \ldots, \log|\det(M_n - z_k)|\right)$$

is insensitive.

The next step is to use (17) to replace the log-determinants $\log|\det(M_n - z)|$ with the log-determinants $\log|\det W_{n,z}|$, where the $W_{n,z}$ are defined by (16). After translating and rescaling the function $G$, we thus see that it suffices to show that

$$\mathbf{E}\, G\left(\log|\det(W_{n,z_1})|, \ldots, \log|\det(W_{n,z_k})|\right)$$

is insensitive.
We observe the identity

$$\log|\det(W_{n,z_j})| = \log|\det(W_{n,z_j} - \sqrt{-1}\,T)| - n \operatorname{Im} \int_0^T s_j(\sqrt{-1}\,\eta)\, d\eta$$

for any $T > 0$ and all $1 \le j \le k$, where $s_j(z) := \frac{1}{n} \operatorname{trace}(W_{n,z_j} - z)^{-1}$ is the Stieltjes transform, as can be seen by writing everything in terms of the eigenvalues of $W_{n,z_j}$. If we set $T := n^{100}$ then we see that

$$\log|\det(W_{n,z_j} - \sqrt{-1}\,T)| = n \log T + \log|\det(1 + \sqrt{-1}\, n^{-100} W_{n,z_j})| = n \log T + O(n^{-10})$$

(say), thanks to (58) and the hypothesis that $z_j$ lies in $B(0, (1-\delta)\sqrt{n})$. Thus, by translating $G$ again, it suffices to show that the quantity

$$\mathbf{E}\, G\left( \left( n \operatorname{Im} \int_0^{n^{100}} s_j(\sqrt{-1}\,\eta)\, d\eta \right)_{j=1}^{k} \right)$$

is insensitive.

15 Alternatively, one can allow the moments to deviate from each other by, say, $O(n^{-100})$, which one can verify will not affect the argument. See [3, Chapter 2] or [36, Appendix A] for details.



We need to truncate away from the event that $W_{n,z_j}$ has an eigenvalue too close to zero. Let $\chi : \mathbb{R} \to \mathbb{R}$ be a smooth cutoff to the region $|x| \le n^{3c_0}$ that equals $1$ for $|x| \le n^{3c_0}/2$. From Proposition 27 and the union bound we have with probability $1 - O(n^{-c_0 + o(1)})$ that there are no eigenvalues of $W_{n,z_j}$ in the interval $[-n^{-1-2c_0}, n^{-1-2c_0}]$ for all $1 \le j \le k$. Combining this with Proposition 29 and a dyadic decomposition, we conclude that with probability $1 - O(n^{-c_0 + o(1)})$ one has

$$|\operatorname{Im} s_j(\sqrt{-1}\, n^{-1-4c_0})| \le n^{2c_0 + o(1)}$$

for all $1 \le j \le k$. In particular, one has

$$\chi(\operatorname{Im} s_j(\sqrt{-1}\, n^{-1-4c_0})) = 1$$

with overwhelming probability.

In view of this fact and (59), it suffices to show that the quantity

$$\mathbf{E}\, G\left( \left( n \operatorname{Im} \int_0^{n^{100}} s_j(\sqrt{-1}\,\eta)\, d\eta \right)_{j=1}^{k} \right) \prod_{j=1}^{k} \chi\left(\operatorname{Im} s_j(\sqrt{-1}\, n^{-1-4c_0})\right) \tag{60}$$

is insensitive.

Call a statistic $S$ very highly insensitive if one has

$$|S(M_n) - S(M'_n)| \ll n^{-2-c}$$

for some fixed $c > 0$. By swapping the real and imaginary parts of the components of $M_n$ with those of $M'_n$ one at a time, we see from telescoping series that it will suffice to show that (60) is very highly insensitive whenever $M_n$ and $M'_n$ are identical in all but one entry, and in that entry either the real parts are identical, or the imaginary parts are identical.

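Fix $M_n, M'_n$ as indicated. Then for each $1 \le j \le k$, one has

$$W_{n,z_j} = W_{n,z_j,0} + \frac{1}{\sqrt{n}}\, \xi V$$

(and similarly for $M'_n$ with $\xi$ replaced by $\xi'$), where $\xi, \xi'$ are real random variables that match to order 4 and have the magnitude bound

$$|\xi|, |\xi'| \ll n^{o(1)}, \tag{61}$$

$V$ is an elementary matrix, and $W_{n,z_j,0}$ is a random Hermitian matrix independent of both $\xi$ and $\xi'$. To emphasise this representation, and to bring the notation closer to that of the preceding section, we rewrite $s_j$ as $s^{(j)}_\xi$, where

$$s^{(j)}_\xi(z) := \frac{1}{n} \operatorname{trace} R^{(j)}_\xi(z)$$

and

$$R^{(j)}_\xi(z) := \left(W_{n,z_j,0} + \frac{1}{\sqrt{n}}\, \xi V - z\right)^{-1}.$$

Our task is now to show that the quantity

$$\mathbf{E}\, G\left( \left( n \operatorname{Im} \int_0^{n^{100}} s^{(j)}_\xi(\sqrt{-1}\,\eta)\, d\eta \right)_{j=1}^{k} \right) \prod_{j=1}^{k} \chi\left(\operatorname{Im} s^{(j)}_\xi(\sqrt{-1}\, n^{-1-4c_0})\right) \tag{62}$$

only changes by $O(n^{-2-c})$ when $\xi$ is replaced by $\xi'$.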

We now place some bounds on $R^{(j)}_\xi(z)$.

Lemma 46 (Eigenvector delocalization). Let $1 \le j \le k$, and suppose that we are in the event that $\chi(\operatorname{Im} s_j(\sqrt{-1}\, n^{-1-4c_0}))$ is non-zero. Then with overwhelming probability, one has

$$\sup_{\eta > 0} \|R^{(j)}_\xi(\sqrt{-1}\,\eta)\|_{(\infty,1)} \ll n^{O(c_0)} \tag{63}$$

and hence (by Lemma 44 and (61), swapping the roles of $\xi$ and $0$)

$$\sup_{\eta > 0} \|R^{(j)}_0(\sqrt{-1}\,\eta)\|_{(\infty,1)} \ll n^{O(c_0)}. \tag{64}$$

The bounds in the above lemma are similar to those from Proposition 31 (and Proposition 31 will be used in the proof of the lemma), but the point here is that the bounds remain uniform in the limit $\eta \to 0$, whereas the bounds in Proposition 31 blow up at that limit.

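Proof. By hypothesis and the support of $\chi$, one has

$$|\operatorname{Im} s_j(\sqrt{-1}\, n^{-1-4c_0})| \le n^{3c_0}.$$

The left-hand side can be expanded as

$$\frac{1}{n} \sum_{i=1}^{n} \frac{n^{-1-4c_0}}{\lambda_i(W_{n,z_j})^2 + n^{-2-8c_0}},$$

and so we obtain the lower bound

$$|\lambda_i(W_{n,z_j})| \gg n^{-1-7c_0/2} \tag{65}$$

for all $i$.

From Proposition 31, one already has

$$\sup_{\eta \ge 1/n} \|R^{(j)}_\xi(\sqrt{-1}\,\eta)\|_{(\infty,1)} \ll n^{o(1)}$$

with overwhelming probability. In particular, for each $1 \le j \le k$, $1 \le l \le n$ and $\eta \ge 1/n$, one has

$$\sum_{i} \frac{\eta\, |u_i(W_{n,z_j})^* e_l|^2}{\lambda_i(W_{n,z_j})^2 + \eta^2} \le n^{o(1)},$$

where the $u_i(W_{n,z_j})$ are unit eigenvectors associated to the eigenvalues $\lambda_i(W_{n,z_j})$. Combining this with (65), we see that

$$\sum_{i} \frac{\eta\, |u_i(W_{n,z_j})^* e_l|^2}{\lambda_i(W_{n,z_j})^2 + \eta^2} \le n^{O(c_0)}$$

for all $\eta > 0$, $1 \le j \le k$, and $1 \le l \le n$. By dyadic summation we conclude that

$$\sum_{i} \frac{|u_i(W_{n,z_j})^* e_l|^2}{(\lambda_i(W_{n,z_j})^2 + \eta^2)^{1/2}} \le n^{O(c_0)}$$

for all $\eta > 0$, and thus by Cauchy-Schwarz one has

$$\left| \sum_{i} \frac{(e_l^* u_i(W_{n,z_j}))(u_i(W_{n,z_j})^* e_m)}{\lambda_i(W_{n,z_j}) - \sqrt{-1}\,\eta} \right| \le n^{O(c_0)}$$

for all $\eta > 0$, $1 \le j \le k$, and $1 \le l, m \le n$. But the left-hand side is the magnitude of the $lm$ coefficient of $R^{(j)}_\xi(\sqrt{-1}\,\eta)$, and the claim follows. □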



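We now condition on the event that (64) holds for all $1 \le j \le k$; Lemma 46 ensures that the error in doing so is $O_A(n^{-A})$ for any $A$. Then by Proposition 45, we have

$$s^{(j)}_\xi(\sqrt{-1}\,\eta) = s^{(j)}_0(\sqrt{-1}\,\eta) + \sum_{i=1}^{4} \xi^i n^{-i/2} c^{(j)}_i(\eta) + O(n^{-5/2 + O(c_0)}) \min\left(1, \frac{1}{n\eta}\right)$$

for each $j$ and all $\eta > 0$, and similarly with $\xi$ replaced by $\xi'$, where the coefficients $c^{(j)}_i$ enjoy the bounds

$$|c^{(j)}_i| \le n^{O(c_0)} \min\left(1, \frac{1}{n\eta}\right).$$

From this and Taylor expansion we see that the expression

$$G\left( \left( n \operatorname{Im} \int_0^{n^{100}} s^{(j)}_\xi(\sqrt{-1}\,\eta)\, d\eta \right)_{j=1}^{k} \right) \prod_{j=1}^{k} \chi\left(\operatorname{Im} s^{(j)}_\xi(\sqrt{-1}\, n^{-1-4c_0})\right)$$

is equal to a polynomial of degree at most 4 in $\xi$ with coefficients independent of $\xi$, plus an error of $O(n^{-5/2 + O(c_0)})$. Since $\xi$ and $\xi'$ match to order 4, the polynomial parts have the same expectation, which gives the claim for $c_0$ small enough.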

Remark 47. If one assumes more than four matching moments, one can improve 
the final constant c in the conclusion of Theorem 23. However, it appears that 
one cannot make c arbitrarily large with this method, basically because the Taylor 
expansion becomes unfavorable when c is too large. 



9. Concentration of log-determinant for gaussian matrices 



In this section we establish Theorem 33. Fix $z_0 \in B(0, C)$; all our implied constants will be uniform in $z_0$. Define $\alpha$ to be the quantity $\alpha := \frac{1}{2}(|z_0|^2 - 1)$ if $|z_0| \le 1$, and $\alpha := \log|z_0|$ if $|z_0| \ge 1$. Our task is to show that $\log|\det(M_n - z_0\sqrt{n})|$ concentrates around $\frac{1}{2} n \log n + \alpha n$.

9.1. The upper bound. In this section, we prove that with overwhelming probability

$$\log|\det(M_n - z_0\sqrt{n})| \le \frac{1}{2} n \log n + \alpha n + n^{o(1)},$$

which is the upper bound of what we need. In fact, the statement (which is based on the second moment method) holds for general random matrices with non-gaussian entries.

Proposition 48 (Upper bound on log-determinant). Let $M_n = (\xi_{ij})_{1 \le i,j \le n}$ be a random matrix with independent entries having mean zero and variance one. Then for any $z_0 \in \mathbb{C}$, one has

$$\log|\det(M_n - z_0\sqrt{n})| \le \frac{1}{2} n \log n + \alpha n + O(n^{o(1)})$$

with overwhelming probability.

The key is the following lemma. 

Lemma 49. Let $M_n = (\xi_{ij})_{1 \le i,j \le n}$ be a random matrix as above. Then one has

$$\mathbf{E}|\det(M_n - z_0\sqrt{n})|^2 \le n! \exp(|z_0|^2 n) \tag{66}$$

for all $z_0 \in \mathbb{C}$. When $|z_0| \ge 1$, we have the variant bound

$$\mathbf{E}|\det(M_n - z_0\sqrt{n})|^2 \le n^{n+1} |z_0|^{2n}. \tag{67}$$

Proof. By cofactor expansion, one has

$$\det(M_n - z_0\sqrt{n}) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} \left(\xi_{i\sigma(i)} - z_0\sqrt{n}\, 1_{\sigma(i)=i}\right)$$

where $S_n$ is the set of permutations on $\{1, \ldots, n\}$. We can rewrite this expression as

$$\sum_{A \subset \{1,\ldots,n\}} \sum_{\sigma \in S_{n,A}} \operatorname{sgn}(\sigma)\, F_{A,\sigma}$$

where $S_{n,A}$ is the set of permutations $\sigma \in S_n$ that fix $A$, thus $\sigma(i) = i$ for all $i \in A$, and

$$F_{A,\sigma} := (-z_0\sqrt{n})^{|A|} \prod_{i \notin A} \xi_{i\sigma(i)}.$$

As the $\xi_{ij}$ are jointly independent and have mean zero, we see that $\mathbf{E} F_{A,\sigma} \overline{F_{A',\sigma'}} = 0$ whenever $(A,\sigma) \ne (A',\sigma')$. Also, as the $\xi_{ij}$ also have unit variance, we have $\mathbf{E}|F_{A,\sigma}|^2 = |z_0|^{2|A|} n^{|A|}$. We conclude that

$$\mathbf{E}|\det(M_n - z_0\sqrt{n})|^2 = \sum_{A \subset \{1,\ldots,n\}} \sum_{\sigma \in S_{n,A}} |z_0|^{2|A|} n^{|A|}.$$

Write $j = |A|$. For each choice of $j = 0, \ldots, n$, there are $\binom{n}{j}$ choices for $A$, and $(n-j)!$ choices for $\sigma$. We conclude that

$$\mathbf{E}|\det(M_n - z_0\sqrt{n})|^2 = n! \sum_{j=0}^{n} \frac{|z_0|^{2j} n^j}{j!}.$$

(This formula is well known in the literature; see e.g. [14, Theorem 3.1].) Since

$$\sum_{j=0}^{n} \frac{|z_0|^{2j} n^j}{j!} \le \exp(|z_0|^2 n)$$

we obtain (66).

Now suppose that $|z_0| \ge 1$. Then the terms

$$\frac{|z_0|^{2j} n^j}{j!}$$

are non-decreasing in $j$, and are thus each bounded by $|z_0|^{2n} n^n / n!$, and (67) follows. □
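The exact second moment formula in the proof above holds for any independent entries of mean zero and unit variance, so it can be checked by brute force for very small $n$. The sketch below is an illustration only: it assumes random sign ($\pm 1$) entries, which are not singled out in the text, and the function names are hypothetical. It averages $|\det(M_n - z_0\sqrt{n})|^2$ over all $2^{n^2}$ sign matrices and compares with $n! \sum_{j} |z_0|^{2j} n^j / j!$.

```python
import itertools
import math
import numpy as np

def second_moment_formula(n, z0):
    """n! * sum_{j=0}^{n} |z0|^{2j} n^j / j!."""
    s = sum(abs(z0) ** (2 * j) * n ** j / math.factorial(j) for j in range(n + 1))
    return math.factorial(n) * s

def second_moment_exact(n, z0):
    """Average of |det(M - z0 sqrt(n))|^2 over all n x n matrices with +-1 entries."""
    shift = z0 * math.sqrt(n) * np.eye(n)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n * n):
        M = np.array(signs).reshape(n, n)
        total += abs(np.linalg.det(M - shift)) ** 2
    return total / 2 ** (n * n)

# Small cases only: the enumeration has 2^(n^2) terms.
for n, z0 in [(2, 0.5), (3, 0.5 + 0.5j)]:
    exact = second_moment_exact(n, z0)
    formula = second_moment_formula(n, z0)
    bound_66 = math.factorial(n) * math.exp(abs(z0) ** 2 * n)
    print(n, z0, exact, formula, bound_66)
```

For $n = 2$ and $z_0 = 1/2$ the formula can also be confirmed by hand: both sides equal $(1 + |z_0|^2 n)^2 + 1 = 3.25$.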



From Lemma 49 and Stirling's formula, we see that

$$\mathbf{E}|\det(M_n - z_0\sqrt{n})|^2 \le \exp(n \log n + 2\alpha n + O(n^{o(1)}))$$

and thus by Markov's inequality we see that

$$|\det(M_n - z_0\sqrt{n})|^2 \le \exp(n \log n + 2\alpha n + O(n^{o(1)}))$$

with overwhelming probability, which gives Proposition 48 as desired.



9.2. Hessenberg form. To finish the proof of Theorem 33, we need to show the lower bound

$$\log|\det(M_n - z_0\sqrt{n})| \ge \frac{1}{2} n \log n + \alpha n - O(n^{o(1)})$$

with overwhelming probability. As we shall see later, the fact that we only seek a one-sided bound now instead of a two-sided one will lead to some convenient simplifications to the argument (see footnote 16).

Now we will make essential use of the fact that the entries are gaussian. The first step is to conjugate a complex gaussian matrix into an almost lower-triangular form first observed in [33], in the spirit of the tridiagonalisation of GUE matrices first observed by Trotter [62], as follows.



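Proposition 50 (Hessenberg matrix form). [33] Let $M_n$ be a complex gaussian matrix, and let $M'_n$ be the random matrix

$$M'_n = \begin{pmatrix}
\xi_{11} & \chi_{n-1,\mathbb{C}} & 0 & \cdots & 0 \\
\xi_{21} & \xi_{22} & \chi_{n-2,\mathbb{C}} & \cdots & 0 \\
\xi_{31} & \xi_{32} & \xi_{33} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\xi_{(n-1)1} & \xi_{(n-1)2} & \xi_{(n-1)3} & \cdots & \chi_{1,\mathbb{C}} \\
\xi_{n1} & \xi_{n2} & \xi_{n3} & \cdots & \xi_{nn}
\end{pmatrix}$$

where $\xi_{ij}$ for $1 \le j \le i \le n$ are iid copies of the complex gaussian $N(0,1)_{\mathbb{C}}$, and for each $1 \le i \le n-1$, $\chi_{i,\mathbb{C}}$ is a complex $\chi$ distribution of $i$ degrees of freedom (see Section 3 for definitions), with the $\xi_{ij}$ and $\chi_{i,\mathbb{C}}$ being jointly independent. Then the spectrum of $M_n$ has the same distribution as the spectrum of $M'_n$.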



16 If one really wished, one could adapt the arguments below to also give the upper bound, 
giving an alternate proof of Proposition 48, but this argument would be more complicated than 
the proof given in the previous section, and we will not pursue it here. 
The same result holds when $M_n$ is a real gaussian matrix, except that the $\xi_{ij}$ are now iid copies of the real gaussian $N(0,1)_{\mathbb{R}}$, and the $\chi_{i,\mathbb{C}}$ are replaced with real $\chi$ distributions $\chi_{i,\mathbb{R}}$ with $i$ degrees of freedom.

Proof. This result appears in [33, §2], but for the convenience of the reader we 
supply a proof here. We establish the complex case only, as the real case is similar, 
making the obvious changes (such as replacing the unitary matrices in the argument 
below by orthogonal matrices instead). 

The idea will be to exploit the unitary invariance of complex gaussian vectors by taking a complex gaussian matrix $M_n$ and conjugating it by unitary matrices (which will depend on $M_n$) until one arrives at a matrix with the distribution of $M'_n$.

Write the first row of $M_n$ as $(\xi_{11}, \ldots, \xi_{1n})$. Then there is a unitary transformation $U_1$ that preserves the first basis vector $e_1$, and maps $(\xi_{11}, \ldots, \xi_{1n})$ to $(\xi_{11}, \chi_{n-1,\mathbb{C}}, 0, \ldots, 0)$, where $\chi_{n-1,\mathbb{C}}$ is a complex $\chi$ distribution with $n-1$ degrees of freedom. If we then conjugate $M_n$ by $U_1$, and use the fact that the conjugate of a gaussian vector by a unitary matrix that is independent of that vector remains distributed as a gaussian vector, we see that the conjugate $U_1 M_n U_1^*$ takes the form

$$\begin{pmatrix}
\xi_{11} & \chi_{n-1,\mathbb{C}} & 0 & \cdots & 0 \\
\xi_{21} & \xi_{22} & \xi_{23} & \cdots & \xi_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\xi_{n1} & \xi_{n2} & \xi_{n3} & \cdots & \xi_{nn}
\end{pmatrix}$$

where the $\xi_{ij}$ coefficients appearing in this matrix are iid copies of $N(0,1)_{\mathbb{C}}$ (and are not necessarily equal to the corresponding coefficients of $M_n$), and $\chi_{n-1,\mathbb{C}}$ is independent of all of the $\xi_{ij}$.

We may then find another unitary transformation $U_2$ that preserves $e_1$ and $e_2$, and maps the second row $(\xi_{21}, \ldots, \xi_{2n})$ of $U_1 M_n U_1^*$ to $(\xi_{21}, \xi_{22}, \chi_{n-2,\mathbb{C}}, 0, \ldots, 0)$, where $\chi_{n-2,\mathbb{C}}$ is distributed by the complex $\chi$ distribution with $n-2$ degrees of freedom. Conjugating $U_1 M_n U_1^*$ by $U_2$, we arrive at a matrix of the form

$$\begin{pmatrix}
\xi_{11} & \chi_{n-1,\mathbb{C}} & 0 & 0 & \cdots & 0 \\
\xi_{21} & \xi_{22} & \chi_{n-2,\mathbb{C}} & 0 & \cdots & 0 \\
\xi_{31} & \xi_{32} & \xi_{33} & \xi_{34} & \cdots & \xi_{3n} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\xi_{n1} & \xi_{n2} & \xi_{n3} & \xi_{n4} & \cdots & \xi_{nn}
\end{pmatrix}$$

where the $\xi_{ij}$ coefficients appearing in this matrix are again iid copies of $N(0,1)_{\mathbb{C}}$ (though they are not necessarily identical to their counterparts in the previous matrix $U_1 M_n U_1^*$), and $\chi_{n-1,\mathbb{C}}$ and $\chi_{n-2,\mathbb{C}}$ are independent of each other and of the $\xi_{ij}$. Iterating this procedure a total of $n-1$ times, we obtain the claim. □
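The linear-algebra part of this conjugation procedure can be sketched numerically. The code below is an illustrative assumption rather than the construction of [33]: it builds each unitary from a QR factorisation (any unitary with the stated action on the row would do), and the function names are hypothetical. It checks only the deterministic facts, namely that the conjugated matrix has the claimed zero pattern with non-negative real superdiagonal entries and the same characteristic polynomial; the distributional statements are not tested.

```python
import numpy as np

def complex_ginibre(n, rng):
    """n x n matrix with iid complex gaussian entries, E|xi|^2 = 1."""
    return (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)

def row_rotation(v):
    """Unitary Q such that the row vector v @ Q equals (|v|, 0, ..., 0)."""
    q, r = np.linalg.qr(v.reshape(-1, 1), mode="complete")  # v^T = q @ r
    phase = r[0, 0] / abs(r[0, 0])
    P = np.conj(phase) * q.conj().T     # P @ v^T = (|v|, 0, ..., 0)^T
    return P.T

def hessenberg_conjugate(M):
    """Conjugate M by unitaries until row k vanishes beyond column k+1."""
    H = M.astype(complex).copy()
    n = H.shape[0]
    for k in range(n - 1):
        U = np.eye(n, dtype=complex)
        U[k + 1:, k + 1:] = row_rotation(H[k, k + 1:])
        # Right multiplication rotates the tail of row k; left multiplication
        # by U^* only touches rows k+1, ..., n-1, so earlier rows keep their form.
        H = U.conj().T @ H @ U
    return H

rng = np.random.default_rng(0)
M = complex_ginibre(6, rng)
H = hessenberg_conjugate(M)
```

In this sketch the superdiagonal entry produced at step $k$ is the Euclidean norm of the remaining $n-k-1$ entries of the current row, which is where the $\chi$ variables with decreasing degrees of freedom come from.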

We now use this conjugated form of the complex gaussian matrix $M_n$ to describe the characteristic polynomial $\det(M_n - z_0\sqrt{n})$.

Proposition 51. Let $z_0$ be a complex number, and let $M_n$ be a complex gaussian matrix. Let $\chi_{1,\mathbb{C}}, \ldots, \chi_{n-1,\mathbb{C}}$ be a sequence of independent random variables distributed according to the complex $\chi$ distributions with $1, \ldots, n-1$ degrees of freedom respectively. Let $\xi_1, \ldots, \xi_n$ be another sequence of independent random variables distributed according to the complex gaussian $N(0,1)_{\mathbb{C}}$, and independent of the $\chi_{i,\mathbb{C}}$. Define the sequence $a_1, \ldots, a_n$ of complex random variables recursively by setting

$$a_1 := \xi_1 - z_0\sqrt{n} \tag{68}$$

and

$$a_{i+1} := -\frac{z_0\sqrt{n}\, a_i}{\sqrt{|a_i|^2 + \chi_{n-i,\mathbb{C}}^2}} + \xi_{i+1} \tag{69}$$

for $i = 1, \ldots, n-1$. (Note that the $a_i$ are almost surely well-defined.) Then the random variable

$$\left( \prod_{i=1}^{n-1} \sqrt{|a_i|^2 + \chi_{n-i,\mathbb{C}}^2} \right) a_n$$

has the same distribution as $\det(M_n - z_0\sqrt{n})$.

The same conclusions hold when $M_n$ is a real gaussian matrix, after replacing the $\xi_i$ with copies of the real gaussian $N(0,1)_{\mathbb{R}}$, and replacing $\chi_{i,\mathbb{C}}$ with a real $\chi$ distribution $\chi_{i,\mathbb{R}}$ with $i$ degrees of freedom.

We remark that in [33] a slightly different stochastic equation (a Hilbert space variant of the Pólya urn process) for the determinants $\det(M_n - z_0\sqrt{n})$ was given, in which the value of each determinant was influenced by a gaussian variable whose variance depended on all of the determinants of the top left $k \times k$ minors for $k = 1, \ldots, n-1$. In contrast, the recurrence here is more explicitly Markovian in the sense that the state $a_{i+1}$ of the recursion at time $i+1$ only depends (stochastically) on the state $a_i$ at the immediately preceding time. We will rely heavily on the Markovian nature of the process in the subsequent analysis.

Proof. Again, we argue for the complex gaussian case only, as the real gaussian case proceeds similarly with the obvious modifications.

By Proposition 50, $\det(M_n - z_0\sqrt{n})$ has the same distribution as $\det(M'_n - z_0\sqrt{n})$. The strategy is then to manipulate $M'_n - z_0\sqrt{n}$ by elementary column operations that preserve the determinant, until it becomes a lower triangular matrix whose diagonal entries have the joint distribution of

$$\left( \sqrt{|a_1|^2 + \chi_{n-1,\mathbb{C}}^2}, \ldots, \sqrt{|a_{n-1}|^2 + \chi_{1,\mathbb{C}}^2}, a_n \right),$$

at which point the claim follows.
We turn to the details. Writing $\xi_1 := \xi_{11}$, we see that $M'_n - z_0\sqrt{n}$ can be written as

$$\begin{pmatrix}
a_1 & \chi_{n-1,\mathbb{C}} & 0 & 0 & \cdots & 0 \\
\xi_{21} & \xi_{22} - z_0\sqrt{n} & \chi_{n-2,\mathbb{C}} & 0 & \cdots & 0 \\
\xi_{31} & \xi_{32} & \xi_{33} - z_0\sqrt{n} & \chi_{n-3,\mathbb{C}} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\xi_{(n-1)1} & \xi_{(n-1)2} & \xi_{(n-1)3} & \xi_{(n-1)4} & \cdots & \chi_{1,\mathbb{C}} \\
\xi_{n1} & \xi_{n2} & \xi_{n3} & \xi_{n4} & \cdots & \xi_{nn} - z_0\sqrt{n}
\end{pmatrix}.$$

Note that there is a unitary matrix $U_1$ whose action on row vectors (multiplying on the right) maps $(a_1, \chi_{n-1,\mathbb{C}}, 0, \ldots, 0)$ to $(\sqrt{|a_1|^2 + \chi_{n-1,\mathbb{C}}^2}, 0, \ldots, 0)$, and which only modifies the first two coefficients of a row vector. This corresponds to a column operation that modifies the first two columns of a matrix in a unitary fashion (by multiplying that matrix on the right by $U_1$). Because complex gaussian vectors remain gaussian after unitary transformations, we see (after a brief computation) that this transformation maps the second row $(\xi_{21}, \xi_{22} - z_0\sqrt{n}, \chi_{n-2,\mathbb{C}}, 0, \ldots, 0)$ of the above matrix to a vector of the form

$$\left( *, \; -\frac{z_0\sqrt{n}\, a_1}{\sqrt{|a_1|^2 + \chi_{n-1,\mathbb{C}}^2}} + \xi_2, \; \chi_{n-2,\mathbb{C}}, \; 0, \ldots, 0 \right),$$

where $\xi_2$ is a complex gaussian (formed by some combination of $\xi_{21}$ and $\xi_{22}$) and $*$ is a quantity whose exact value will not be relevant for us. By (69), we may denote the second coefficient of this vector by $a_2$. The remaining rows of the matrix have their distribution unchanged by the unitary matrix $U_1$, because their first two entries form a complex gaussian vector. Thus, after applying the $U_1$ column operation to the above matrix, we arrive at a matrix with the distribution

$$\begin{pmatrix}
\sqrt{|a_1|^2 + \chi_{n-1,\mathbb{C}}^2} & 0 & 0 & 0 & \cdots & 0 \\
* & a_2 & \chi_{n-2,\mathbb{C}} & 0 & \cdots & 0 \\
\xi_{31} & \xi_{32} & \xi_{33} - z_0\sqrt{n} & \chi_{n-3,\mathbb{C}} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\xi_{(n-1)1} & \xi_{(n-1)2} & \xi_{(n-1)3} & \xi_{(n-1)4} & \cdots & \chi_{1,\mathbb{C}} \\
\xi_{n1} & \xi_{n2} & \xi_{n3} & \xi_{n4} & \cdots & \xi_{nn} - z_0\sqrt{n}
\end{pmatrix}$$

where the $\xi_{ij}$ here are iid copies of $N(0,1)_{\mathbb{C}}$ that are independent of $a_1$, $a_2$, and the $\chi_{i,\mathbb{C}}$ (and which are not necessarily identical to their counterparts in the previous matrix under consideration). Of course, the determinant of this matrix has the same distribution as the determinant of the preceding matrix.

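In a similar fashion, we may find a unitary matrix $U_2$ whose action on row vectors maps $(*, a_2, \chi_{n-2,\mathbb{C}}, 0, \ldots, 0)$ to $(*, \sqrt{|a_2|^2 + \chi_{n-2,\mathbb{C}}^2}, 0, \ldots, 0)$, and which only modifies the second and third coefficients of a row vector. Applying the associated column operation, and arguing as before, we arrive at a matrix with the distribution

$$\begin{pmatrix}
\sqrt{|a_1|^2 + \chi_{n-1,\mathbb{C}}^2} & 0 & 0 & 0 & \cdots & 0 \\
* & \sqrt{|a_2|^2 + \chi_{n-2,\mathbb{C}}^2} & 0 & 0 & \cdots & 0 \\
* & * & a_3 & \chi_{n-3,\mathbb{C}} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\xi_{(n-1)1} & \xi_{(n-1)2} & \xi_{(n-1)3} & \xi_{(n-1)4} & \cdots & \chi_{1,\mathbb{C}} \\
\xi_{n1} & \xi_{n2} & \xi_{n3} & \xi_{n4} & \cdots & \xi_{nn} - z_0\sqrt{n}
\end{pmatrix}$$

where again the values of the entries marked $*$ are not relevant for us. Iterating this procedure a total of $n-1$ times, we finally arrive at a lower triangular matrix whose diagonal entries have the distribution of

$$\left( \sqrt{|a_1|^2 + \chi_{n-1,\mathbb{C}}^2}, \sqrt{|a_2|^2 + \chi_{n-2,\mathbb{C}}^2}, \ldots, \sqrt{|a_{n-1}|^2 + \chi_{1,\mathbb{C}}^2}, a_n \right)$$

and whose determinant has the same distribution as that of $M'_n - z_0\sqrt{n}$ or $M_n - z_0\sqrt{n}$. The claim follows. □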
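The recursion of Proposition 51 is easy to simulate. The following sketch is an illustration under stated assumptions: it takes the recursion in the form $a_1 = \xi_1 - z_0\sqrt{n}$, $a_{i+1} = -z_0\sqrt{n}\,a_i/\sqrt{|a_i|^2 + \chi^2_{n-i,\mathbb{C}}} + \xi_{i+1}$, and it samples $\chi^2_{k,\mathbb{C}}$ as a Gamma variable with shape $k$ and scale $1$ (mean $k$, variance $k$), which is an assumed normalisation of the complex $\chi$ distribution. The function names are hypothetical. It returns one sample of $\frac{1}{2}\sum_{i=1}^{n-1}\log(|a_i|^2 + \chi^2_{n-i,\mathbb{C}}) + \log|a_n|$, which can be compared with the log-determinant of a directly sampled matrix and with $\frac{1}{2} n\log n + \alpha n$.

```python
import numpy as np

def complex_gaussian(size, rng):
    """iid complex gaussians with mean zero and E|xi|^2 = 1."""
    return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2)

def log_abs_det_via_recursion(n, z0, rng):
    """One sample of log|det| generated from the scalar recursion."""
    xi = complex_gaussian(n, rng)
    shift = z0 * np.sqrt(n)
    a = xi[0] - shift                                  # a_1
    total = 0.0
    for i in range(1, n):
        chi_sq = rng.gamma(shape=n - i, scale=1.0)     # assumed law of chi^2_{n-i,C}
        r_sq = abs(a) ** 2 + chi_sq
        total += 0.5 * np.log(r_sq)
        a = -shift * a / np.sqrt(r_sq) + xi[i]         # a_{i+1}
    return total + np.log(abs(a))

def predicted_center(n, z0):
    """(1/2) n log n + alpha n with alpha as defined at the start of Section 9."""
    alpha = 0.5 * (abs(z0) ** 2 - 1) if abs(z0) <= 1 else np.log(abs(z0))
    return 0.5 * n * np.log(n) + alpha * n

rng = np.random.default_rng(1)
n, z0 = 400, 0.5
from_recursion = log_abs_det_via_recursion(n, z0, rng)
M = complex_gaussian((n, n), rng)
direct = np.linalg.slogdet(M - z0 * np.sqrt(n) * np.eye(n))[1]
center = predicted_center(n, z0)
```

The two samples are independent draws, so they are not expected to agree exactly; both should lie within a distance much smaller than $n$ of `center`.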



9.3. A nonlinear stochastic difference equation. For the sake of exposition, we now specialize to the complex gaussian case; the case when $M_n$ is a real gaussian matrix is similar and we will indicate at various junctures what changes need to be made.

From Proposition 51, we see that $\log|\det(M_n - z_0\sqrt{n})|$ has the same distribution as

$$\frac{1}{2} \sum_{i=1}^{n-1} \log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) + \log|a_n|. \tag{70}$$

It thus suffices to establish the lower bound

$$\frac{1}{2} \sum_{i=1}^{n-1} \log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) + \log|a_n| \ge \frac{1}{2} n \log n + \alpha n - n^{o(1)} \tag{71}$$

with overwhelming probability.

We first note that as the distribution of $\log|\det(M_n - z_0\sqrt{n})|$ is invariant with respect to phase rotation $z_0 \mapsto z_0 e^{\sqrt{-1}\,\theta}$, we may assume without loss of generality that $z_0$ is real and non-positive, thus

$$a_{i+1} := \frac{|z_0|\sqrt{n}\, a_i}{\sqrt{|a_i|^2 + \chi_{n-i,\mathbb{C}}^2}} + \xi_{i+1}. \tag{72}$$



Remark 52. In the real gaussian case, one does not have phase rotation invariance. However, by making the change of variables $a'_i := a_i e^{\sqrt{-1}\, i\theta}$ (for a phase $\theta$ chosen so that $-z_0 e^{\sqrt{-1}\,\theta} = |z_0|$) one can obtain the variant

$$a'_{i+1} := \frac{|z_0|\sqrt{n}\, a'_i}{\sqrt{|a'_i|^2 + \chi_{n-i,\mathbb{R}}^2}} + \xi'_{i+1} \tag{73}$$

to (72), where $\xi'_{i+1} := e^{\sqrt{-1}\,(i+1)\theta} \xi_{i+1}$. It will turn out that this recurrence is similar enough to (72) that the arguments below used to study (72) can be adapted to (73); the $\xi'_i$ are no longer identically distributed, but they still have mean zero, variance one, and are jointly independent, and this is all that is needed in the arguments that follow.



The random variable $\chi_{n-i,\mathbb{C}}^2$ has mean $n-i$ and variance $n-i$. As such, it is natural to make the change of variables

$$\chi_{n-i,\mathbb{C}}^2 =: n - i + \sqrt{n-i}\, \eta_{n-i}$$

where the $\eta_1, \ldots, \eta_{n-1}$ have mean zero, variance one, and are independent of each other and of the $\xi_i$.

Remark 53. For real gaussian matrices, the situation is very similar, except that the error terms $\eta_{n-i}$ now have variance two instead of one. However, this will not significantly affect the concentration results for the log-determinant in this paper. (This will however presumably affect any central limit theorems one could establish for the log-determinant, in analogy with [60], though we will not pursue such theorems here.)



We now pause to perform a technical truncation. As the $\xi_i$ are distributed in a gaussian fashion, we know that

$$\sup_{1 \le i \le n} |\xi_i| \le n^{o(1)} \tag{74}$$

with overwhelming probability. Similarly, standard asymptotics for chi-square distributions also give the bound

$$\sup_{1 \le i \le n} |\eta_i| \le n^{o(1)} \tag{75}$$

with overwhelming probability (this bound also follows from Proposition 35).

We may now condition on the event that (74), (75) hold (for a suitable choice of the $o(1)$ decay exponent). Importantly, the joint independence of the $\xi_1, \ldots, \xi_n, \eta_1, \ldots, \eta_{n-1}$ remains unchanged by this conditioning. Of course, the distribution of the $\xi_i$ and $\eta_i$ will be slightly distorted by this conditioning, but this will not cause a difficulty in practice, as the mean, variances, and higher moments of these variables are only modified by $O(n^{-100})$ (say) at most, and also we will at key junctures in the proof be able to undo the conditioning (after accepting an event of negligible probability) in order to restore the original distributions of $\xi_i$ and $\eta_i$ if needed.



We return to the task of proving (71). We write (72) as

$$a_{i+1} := \frac{|z_0|\sqrt{n}\, a_i}{\sqrt{|a_i|^2 + n - i + \sqrt{n-i}\, \eta_{n-i}}} + \xi_{i+1}. \tag{76}$$

We will treat this as a nonlinear stochastic difference equation in the $a_i$. If we ignore the diffusion terms $\eta_{n-i}, \xi_{i+1}$, we see that (76) is governed by the dynamics of the maps

$$a \mapsto \frac{|z_0|\sqrt{n}\, a}{\sqrt{|a|^2 + n - i}} \tag{77}$$

as $i$ increases from $1$ to $n-1$. In the regime $i < (1-|z_0|^2)n$, we see that this map has a stable fixed point at zero, while in the regime $i > (1-|z_0|^2)n$, this map has an unstable fixed point at zero and a fixed circle at $|a| = \sqrt{|z_0|^2 n - (n-i)}$. This suggests that $|a_i|$ should concentrate somehow around $0$ for $i < (1-|z_0|^2)n$ and around $\sqrt{|z_0|^2 n - (n-i)}$ for $i > (1-|z_0|^2)n$. In particular, this leads to the heuristic

$$|a_i|^2 + \chi_{n-i,\mathbb{C}}^2 \approx \max(n-i, |z_0|^2 n).$$

Note from the integral test that

$$\frac{1}{2} \sum_{i=1}^{n-1} \log\max(n-i, |z_0|^2 n) = \frac{1}{2} \int_1^n \log\max(n-t, |z_0|^2 n)\, dt + O(n^{o(1)}) = \frac{1}{2} n \log n + \alpha n + O(n^{o(1)}), \tag{78}$$

where the second identity follows from a routine integration (treating the cases $|z_0| \le 1$ and $|z_0| \ge 1$ separately). This gives heuristic support for the desired bound (71).
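The approximation (78) can be sanity-checked numerically. The short sketch below (with hypothetical function names, and with $n$ and the sample values of $z_0$ chosen purely for illustration) evaluates $\frac{1}{2}\sum_{i=1}^{n-1}\log\max(n-i, |z_0|^2 n)$ directly and compares it with $\frac{1}{2} n\log n + \alpha n$, where $\alpha = \frac{1}{2}(|z_0|^2 - 1)$ for $|z_0| \le 1$ and $\alpha = \log|z_0|$ for $|z_0| \ge 1$.

```python
import math

def alpha(z0):
    """alpha = (|z0|^2 - 1)/2 inside the unit disk, log|z0| outside."""
    r = abs(z0)
    return 0.5 * (r * r - 1.0) if r <= 1.0 else math.log(r)

def heuristic_sum(n, z0):
    """(1/2) * sum_{i=1}^{n-1} log max(n - i, |z0|^2 n)."""
    c = abs(z0) ** 2 * n
    return 0.5 * sum(math.log(max(n - i, c)) for i in range(1, n))

def main_term(n, z0):
    """(1/2) n log n + alpha n."""
    return 0.5 * n * math.log(n) + alpha(z0) * n

n = 10_000
for z0 in (0.0, 0.5, 1.5):
    gap = heuristic_sum(n, z0) - main_term(n, z0)
    print(z0, gap)   # the gap should be of size O(log n), far smaller than n
```

In these examples the discrepancy is logarithmic in $n$ (it comes from the lower-order terms in Stirling's formula), which is consistent with the $O(n^{o(1)})$ error in (78).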

We now make the above analysis rigorous. Because we are only seeking a lower bound (71), the main task will be to obtain lower bounds that are roughly of the form

$$|a_i|^2 + \chi_{n-i,\mathbb{C}}^2 \gtrsim \max(n-i, |z_0|^2 n)$$

with overwhelming probability. In the "early regime" $i < (1-|z_0|^2)n$, we will be able to achieve this easily from the trivial bound $|a_i| \ge 0$. In the "late regime" $i > (1-|z_0|^2)n$, the main difficulty is then to show (with overwhelming probability) that $a_i$ avoids the unstable fixed point at zero, and instead is essentially at least as far away from the origin as the fixed circle $|a| = \sqrt{|z_0|^2 n - (n-i)}$.

We turn to the details. We begin with a crude bound on the magnitude of the quantities $a_i$.

Lemma 54 (Crude lower bound). Almost surely (after conditioning to (74), (75)), one has

$$\sup_{1 \le i \le n} |a_i| \le (1 + |z_0|)\sqrt{n} \tag{79}$$

and with overwhelming probability

$$\inf_{1 \le i \le n} |a_i| \ge \exp(-n^{o(1)}). \tag{80}$$

Proof. From (68), (74) we see that we have

$$|a_1| \le 2\sqrt{n}.$$

From (72) (trivially bounding $\chi_{n-i,\mathbb{C}}^2$ from below by zero) we have

$$|a_{i+1}| \le |z_0|\sqrt{n} + |\xi_{i+1}|$$

and so the bound (79) follows from (74) and the assumption that $|z_0| \le 1$.
Now we prove (80). Let $A > 0$ be fixed. Observe that $\xi_1$ has a bounded density function (even after conditioning on (74)), so from (68) we have

$$|a_1| \ge n^{-A}$$

with probability $1 - O(n^{-2A})$ (see footnote 17). In a similar spirit, for any $i = 1, \ldots, n-1$, $\xi_{i+1}$ has a bounded density function, so from (72) or (76) (after temporarily conditioning $a_i$ and $\eta_{n-i}$ to be fixed) we see that

$$|a_{i+1}| \ge n^{-A}$$

with probability $1 - O(n^{-2A})$. By the union bound, we conclude that

$$\inf_{1 \le i \le n} |a_i| \ge n^{-A}$$

with probability $1 - O(n^{-2A+1})$. Diagonalising in $A$, we obtain the claim. □

From this lemma, we conclude that

$$\log|a_i| = O(n^{o(1)}) \tag{81}$$

with overwhelming probability for each $1 \le i \le n$. To show (71), it thus suffices to establish, for each fixed $\varepsilon > 0$, that

$$\frac{1}{2} \sum_{i=1}^{n-1} \log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) \ge \frac{1}{2} n \log n + \alpha n - O(n^{O(\varepsilon)})$$

with overwhelming probability, where the implied constant in the $O(\varepsilon)$ notation is understood to be independent of $\varepsilon$ of course.

In view of (78), it will suffice to show that

$$\sum_{n^\varepsilon \le i \le n - n^\varepsilon} \left( \log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) - \log\max(n-i, |z_0|^2 n) \right) \ge -O(n^{O(\varepsilon)}) \tag{82}$$

with overwhelming probability, as the contributions of the $i$ within $n^\varepsilon$ of $1$ or $n$ can be controlled by $O(n^{\varepsilon + o(1)})$ thanks to Lemma 54.

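9.4. Lower bound at early times. We partition $\sum_{n^\varepsilon \le i \le n - n^\varepsilon} \left( \log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) - \log\max(n-i, |z_0|^2 n) \right)$ into two parts, according to the heuristics following (77). The following simple lemma handles the first part of the partition.

Lemma 55 (Concentration at early times). One has

$$\sum_{n^\varepsilon \le i \le \min((1-|z_0|^2)n + |z_0| n^{1/2+\varepsilon},\, n - n^\varepsilon)} \left( \log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) - \log\max(n-i, |z_0|^2 n) \right) \ge -O(n^{O(\varepsilon)})$$

with overwhelming probability.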



17 In the real gaussian case, the $n^{-2A}$ factor worsens to $n^{-A}$, but this does not impact the final conclusion.
Proof. We abbreviate the summation as $\sum_i$. The key observation here is that we need only a lower bound, so we can use the trivial inequality

$$\log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) \ge \log \chi_{n-i,\mathbb{C}}^2.$$

It suffices to show that

$$\sum_i |\log(n-i) - \log\max(n-i, |z_0|^2 n)| = O(n^{O(\varepsilon)}) \tag{83}$$

and

$$\sum_i \left( \log \chi_{n-i,\mathbb{C}}^2 - \log(n-i) \right) = O(n^{O(\varepsilon)}) \tag{84}$$

with overwhelming probability.

We first verify (83). The summand is only non-zero when $i = (1-|z_0|^2)n + j$ for some $0 < j \le \min(|z_0| n^{1/2+\varepsilon}, |z_0|^2 n - n^\varepsilon)$, and so one can bound the left-hand side of (83) by

$$\sum_{0 < j \le \min(|z_0| n^{1/2+\varepsilon},\, |z_0|^2 n - n^\varepsilon)} |\log(|z_0|^2 n - j) - \log(|z_0|^2 n)|.$$

When $j \le |z_0|^2 n - n^\varepsilon$, we may bound

$$|\log(|z_0|^2 n - j) - \log(|z_0|^2 n)| \ll n^{o(1)} \frac{j}{|z_0|^2 n},$$

and the claim then follows by summing over all $0 < j \le |z_0| n^{1/2+\varepsilon}$.

Now we verify (84), which is quite standard. Writing $\chi_{n-i,\mathbb{C}}^2 = n - i + \sqrt{n-i}\, \eta_{n-i}$, we can write the left-hand side of (84) as

$$\sum_i \log\left(1 + \frac{\eta_{n-i}}{\sqrt{n-i}}\right).$$

From Taylor expansion and (75) we then have

$$\log\left(1 + \frac{\eta_{n-i}}{\sqrt{n-i}}\right) = \frac{\eta_{n-i}}{\sqrt{n-i}} + O\left(\frac{n^{o(1)}}{n-i}\right).$$

The sum of the error term is acceptable, so it suffices to show that

$$\sum_i \frac{\eta_{n-i}}{\sqrt{n-i}} = O(n^{O(\varepsilon)})$$

with overwhelming probability. But this follows from Proposition 35 (see footnote 18). □

18 Strictly speaking, Proposition 35 does not apply directly because the mean of the random variables $\eta_{n-i}$ deviates very slightly from zero when the conditioning (75) is applied. However, one can first apply Proposition 35 to the unconditioned variables $\eta_{n-i}$, and then apply the conditioning (75) that is in force elsewhere in this argument, noting that such conditioning does not affect the property of an event occurring with overwhelming probability.

Remark 56. Following the heuristics after (77), it would be more natural to consider $n^\varepsilon \le i \le (1-|z_0|^2)n$. The extra term $|z_0| n^{1/2+\varepsilon}$ in the upper bound of $i$ is needed for a technical reason which will be clear in the analysis of larger $i$ (see Lemma 58).

9.5. Concentration at late times. Define

$$i_0 := \max(n^\varepsilon, (1-|z_0|^2)n + |z_0| n^{1/2+\varepsilon}). \tag{85}$$

In view of Lemma 55, we see that to prove (82) it now suffices to establish the bound

$$\sum_{i_0 \le i \le n - n^\varepsilon} \left( \log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) - \log(|z_0|^2 n) \right) = O(n^{O(\varepsilon)}) \tag{86}$$

with overwhelming probability. In fact, we only need the lower bound from (86), but the argument given here gives the matching upper bound as well with no additional effort.

Let us first deal with the easy case when

$$|z_0| \le n^{-1/2 + 400\varepsilon} \tag{87}$$

(say). In this case, there are only $O(n^{800\varepsilon})$ terms in the sum, and from Lemma 54 (discarding the non-negative $\chi_{n-i,\mathbb{C}}^2$ term) each term is at least $-O(n^{o(1)})$, so the claim (86) follows immediately. (Note that the summation is in fact empty unless $|z_0| \ge n^{-1/2+\varepsilon/2}$, so the $\log(|z_0|^2 n)$ term is $O(n^{o(1)})$.) Thus, in the arguments below we can assume that

$$|z_0| > n^{-1/2 + 400\varepsilon}. \tag{88}$$



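Observe from (72) that

$$\log(|a_i|^2 + \chi_{n-i,\mathbb{C}}^2) - \log(|z_0|^2 n) = \log \frac{|a_i|^2}{|a_{i+1} - \xi_{i+1}|^2}.$$

From telescoping series and (81) we have

$$\sum_{i_0 \le i \le n - n^\varepsilon} \log \frac{|a_i|^2}{|a_{i+1}|^2} = O(n^{o(1)})$$

with overwhelming probability, so by the triangle inequality it suffices to show that

$$\sum_{i_0 \le i \le n - n^\varepsilon} \log \frac{|a_{i+1} - \xi_{i+1}|^2}{|a_{i+1}|^2} = O(n^{O(\varepsilon)})$$

with overwhelming probability. We can rewrite

$$\frac{|a_{i+1} - \xi_{i+1}|^2}{|a_{i+1}|^2} = \left| 1 + \frac{\xi_{i+1}}{a'_i} \right|^{-2}$$

where

$$a'_i := a_{i+1} - \xi_{i+1} = \frac{|z_0|\sqrt{n}\, a_i}{\sqrt{|a_i|^2 + \chi_{n-i,\mathbb{C}}^2}}. \tag{89}$$

It suffices to show that

$$\sum_{i_0 \le i \le n - n^\varepsilon} \log\left| 1 + \frac{\xi_{i+1}}{a'_i} \right| = O(n^{O(\varepsilon)})$$

with overwhelming probability.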

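The heart of the matter will be the following lemma.

Lemma 57. With overwhelming probability,

$$|a'_i| \gg n^{-100\varepsilon} (i - (1-|z_0|^2)n)^{1/2} \tag{90}$$

holds for all $i_0 \le i \le n - n^\varepsilon$.

Assuming this lemma for the moment, we can then use it to conclude the proof as follows. For any $i_0 \le i \le n - n^\varepsilon$, one has

$$(i - (1-|z_0|^2)n)^{1/2} \ge (i_0 - (1-|z_0|^2)n)^{1/2} \ge (|z_0| n^{1/2+\varepsilon})^{1/2} \ge n^{200\varepsilon} \tag{91}$$

by (85) and (88), and thus by Lemma 57

$$|a'_i| \gg n^{100\varepsilon}$$

with overwhelming probability. From this and (74) we see that

$$\left| \frac{\xi_{i+1}}{a'_i} \right| = o(1);$$

indeed, the same argument gives the more precise bound

$$\left| \frac{\xi_{i+1}}{a'_i} \right| \ll n^{O(\varepsilon)} (i - (1-|z_0|^2)n)^{-1/2}.$$

Performing a Taylor expansion (up to the second order term), we conclude that

$$\log\left| 1 + \frac{\xi_{i+1}}{a'_i} \right| = \operatorname{Re}(\xi_{i+1}/a'_i) + O(n^{O(\varepsilon)} (i - (1-|z_0|^2)n)^{-1})$$

with overwhelming probability.

The error terms $O(n^{O(\varepsilon)} (i - (1-|z_0|^2)n)^{-1})$ sum to $O(n^{O(\varepsilon)})$, so it suffices to show that

$$\sum_{i_0 \le i \le n - n^\varepsilon} \operatorname{Re} \frac{\xi_{i+1}}{a'_i} = O(n^{O(\varepsilon)}) \tag{92}$$

with overwhelming probability. But from (90), one has

$$\frac{1}{|a'_i|} \ll n^{100\varepsilon} (i - (1-|z_0|^2)n)^{-1/2}$$

with overwhelming probability. Also, the coefficient $1/a'_i$ depends on $\xi_1, \ldots, \xi_i$ and $\chi_{1,\mathbb{C}}, \ldots, \chi_{n,\mathbb{C}}$ and is independent of $\xi_{i+1}, \ldots, \xi_n$, so the sum in (92) becomes a martingale sum (see footnote 19). The claim then follows from Proposition 35.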



19 Again, strictly speaking one should apply Proposition 35 to the unconditioned variables and 
then apply the conditioning (74), (75), as in Lemma 55. 






It remains to prove (90). From (72), (89), (74) we have

$$a'_i = a_{i+1} - \xi_{i+1} = a_{i+1} + O(n^{o(1)})$$

and so by (91) it will suffice to establish the bound

$$|a_i| \gg n^{-99\varepsilon} (i - (1-|z_0|^2)n)^{1/2} \tag{93}$$

with overwhelming probability for each $i_0 \le i \le n - n^\varepsilon + 1$.

In order to prove (93), let us first establish a preliminary largeness result on $a_i$, which uses the diffusive term $\xi_{i+1}$ in (72) to push this random variable away from the unstable equilibrium at zero of the map (77):

Lemma 58 (Initial largeness). With overwhelming probability, one has

$$\sup_{\max(i_0 - \frac{1}{2}|z_0| n^{1/2+\varepsilon},\, 0) \le i \le i_0} |a_i| \ge A \tag{94}$$

where $A$ is the quantity

$$A := |z_0|^{1/2} n^{1/4 + \varepsilon/10}.$$

Proof. Suppose first that 

*o-^Mn 1/2+e <o. 

By (85), this implies that \z a \ » 1, and then from (68), (74) we have |ai| n 1 / 2 , 
which certainly gives (94) in this case. Thus we may assume that 

to - \\zo\n 1/2+e >0. 

It will suffice to show that, for each integer 

*o - \\z \n 1/2+e < h <i 

and each fixed (i.e. conditioned) choice of , ^ and Xn-i,Ci ■ ■ • )Xn-iu one 

has 

(95) sup \ai\ > A 

J 1 <i<il + |zo|n 1 /2 + e/2 

with conditional probability at least $q$ for some fixed $q > 0$. Indeed, we can choose in
the interval $[i_0 - \frac{1}{2}|z_0|n^{1/2+\varepsilon},\ i_0 - |z_0|n^{1/2+\varepsilon/2}]$ at least $n^{\varepsilon/2}/4$ initial points $i_1,\dots,i_m$
so that the distance between any two of them is at least $|z_0|n^{1/2+\varepsilon/2}$. If we let $E_j$
for $j = 1,\dots,m$ be the event that (95) holds with $i_1$ replaced by $i_j$, then the above
claim asserts that after conditioning on the failure of the events $E_1,\dots,E_{j-1}$, the
event $E_j$ holds with conditional probability at least $q$. Multiplying the conditional
probabilities together, we then obtain (94) with a failure probability of at most

$(1-q)^{n^{\varepsilon/2}/4}$

which is $O(n^{-A})$ for any fixed $A > 0$, as required.
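
The decay of this failure probability can be checked numerically by comparing logarithms. The following is a minimal sketch only: the function names and the sample values of $q$, $\varepsilon$ and $A$ are illustrative assumptions and do not come from the argument above.

```python
import math

def log_failure_probability(n: float, q: float, eps: float) -> float:
    """Natural log of (1 - q)^(n^(eps/2) / 4), the failure bound quoted above."""
    m = n ** (eps / 2) / 4       # number of conditionally independent trials
    return m * math.log1p(-q)    # log((1 - q)^m), negative for 0 < q < 1

def beats_polynomial(n: float, q: float, eps: float, A: float) -> bool:
    """True when (1 - q)^(n^(eps/2)/4) <= n^(-A) for these sample values."""
    return log_failure_probability(n, q, eps) <= -A * math.log(n)
```

For fixed $q$, $\varepsilon$, $A$ the comparison fails for small $n$ and holds once $n^{\varepsilon/2}$ dominates $A \log n$, which is the sense in which the bound is $O(n^{-A})$.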
Fix $i_0 - \frac{1}{2}|z_0|n^{1/2+\varepsilon} < i_1 < i_0$ and $\xi_1,\dots,\xi_{i_1}$ and $\chi_{n-1,\mathbb{C}},\dots,\chi_{n-i_1,\mathbb{C}}$; all
probabilities in this argument are now understood to be conditioned on these choices.
The quantity $a_{i_1}$ is now deterministic, and we may of course assume that

(96) $|a_{i_1}| < A$

as the claim is trivial otherwise. We may also condition on the event that (75) holds.
Let $i_2 := \lfloor i_1 + |z_0|n^{1/2+\varepsilon/2} \rfloor$. Our goal is to show that

$P\left( \sup_{i_1 \le i \le i_2} |a_i| \ge A \right) \gg 1$.

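For technical reasons (having to do with the contractive nature of the recursion
(72) when $a_i$ becomes large), it will be convenient to replace the random process
$a_i$ by a slightly truncated random process $\tilde a_i$ for $i_1 \le i \le i_2$, which is defined by
setting $\tilde a_{i_1} := a_{i_1}$ and

(97) $\tilde a_{i+1} := \frac{|z_0|\sqrt{n}\,\tilde a_i}{\sqrt{\min(|\tilde a_i|,A)^2 + n - i + \sqrt{n-i}\,\eta_{n-i}}} + \xi_{i+1}$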



for $i_1 \le i < i_2$. From an induction on the upper range $i_2$ of the $i$ parameter, we see
that

$\sup_{i_1 \le i \le i_2} |a_i| < A \implies \sup_{i_1 \le i \le i_2} |\tilde a_i| < A$

and in particular

$|\tilde a_{i_2}| \ge A \implies \sup_{i_1 \le i \le i_2} |a_i| \ge A$.

Thus it will suffice to show that

(98) $P(|\tilde a_{i_2}| \ge A) \gg 1$.

By a standard Paley-Zygmund type argument, it will suffice to obtain the lower
bound

(99) $E|\tilde a_{i_2}|^2 \gg |z_0| n^{1/2+\varepsilon/2}$
on the second moment, and the upper bound

(100) $E|\tilde a_{i_2}|^4 \ll |z_0|^2 n^{1+\varepsilon} + |z_0| n^{1/2+\varepsilon/2} E|\tilde a_{i_2}|^2$

on the fourth moment. Indeed, if $p$ denotes the probability in (98), then from
Hölder's inequality one has

$E|\tilde a_{i_2}|^2 \ll A^2 + p^{1/2} (E|\tilde a_{i_2}|^4)^{1/2}$

and then from (100) and (99) (and the definition of $A$) we obtain $p \gg 1$ as required.
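
As an illustration of the second-moment method used here, the sketch below checks the textbook Paley-Zygmund inequality $P(Z > \theta E Z) \ge (1-\theta)^2 (EZ)^2 / E Z^2$ on a finite distribution, with $Z \ge 0$ playing the role of $|\tilde a_{i_2}|^2$. This is the standard form of the inequality rather than the exact Hölder computation above, and the function name and sample distribution are assumptions made for the example.

```python
from fractions import Fraction

def paley_zygmund_check(values, probs, theta):
    """Return (lhs, rhs) of P(Z > theta*EZ) >= (1-theta)^2 (EZ)^2 / E[Z^2].

    values: the non-negative values taken by Z
    probs:  the matching probabilities (summing to 1)
    theta:  a parameter in [0, 1]
    """
    mean = sum(p * v for v, p in zip(values, probs))
    second = sum(p * v * v for v, p in zip(values, probs))
    lhs = sum(p for v, p in zip(values, probs) if v > theta * mean)
    rhs = (1 - theta) ** 2 * mean ** 2 / second
    return lhs, rhs
```

A lower bound on the second moment of $\tilde a_{i_2}$ together with an upper bound on its fourth moment feeds into the right-hand side in the same way that (99) and (100) feed into the bound on $p$.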

It remains to establish (99) and (100). For this, we will use (97) to track the
growth of the moments $E|\tilde a_i|^2$, $E|\tilde a_i|^4$ as $i$ increases from $i_1$ to $i_2$.



Let $i_1 \le i < i_2$. From (97) we thus have

$E|\tilde a_{i+1}|^2 = E\left| \frac{|z_0|\sqrt{n}\,\tilde a_i}{\sqrt{\min(|\tilde a_i|,A)^2 + n - i + \sqrt{n-i}\,\eta_{n-i}}} + \xi_{i+1} \right|^2$.

The quantity $\xi_{i+1}$ has mean $O(n^{-100})$, variance $1 + O(n^{-100})$ (the $O(n^{-100})$ errors
arising from our conditioning to (74)), and is independent of the other random
variables on the right-hand side. Thus (using (79)) we have

$E|\tilde a_{i+1}|^2 = E\left| \frac{|z_0|\sqrt{n}\,\tilde a_i}{\sqrt{\min(|\tilde a_i|,A)^2 + n - i + \sqrt{n-i}\,\eta_{n-i}}} \right|^2 + 1 + O(n^{-90})$.



Upper bounding $\min(|\tilde a_i|, A)$ by $A$ and $n - i$ by $|z_0|^2 n - |z_0| n^{1/2+\varepsilon}/2$, and using
(75) (which we recall that we have conditioned on), we conclude that

$\min(|\tilde a_i|, A)^2 + n - i + \sqrt{n-i}\,\eta_{n-i} \le |z_0|^2 n$.

This implies that

(101) $E|\tilde a_{i+1}|^2 \ge E|\tilde a_i|^2 + 1 + O(n^{-90})$.

Iterating this $\gg |z_0| n^{1/2+\varepsilon/2}$ times, we obtain (99) as required.
Now we turn to (100). Again, we let $i_1 \le i < i_2$. From (97) we have

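$E|\tilde a_{i+1}|^4 = E\left| \frac{|z_0|\sqrt{n}\,\tilde a_i}{\sqrt{\min(|\tilde a_i|,A)^2 + n - i + \sqrt{n-i}\,\eta_{n-i}}} + \xi_{i+1} \right|^4$.

Expanding this out using the independence and moment properties
of $\xi_{i+1}$ we can estimate the above expression as

$E\left| \frac{|z_0|\sqrt{n}\,\tilde a_i}{\sqrt{\min(|\tilde a_i|,A)^2 + n - i + \sqrt{n-i}\,\eta_{n-i}}} \right|^4 + O\left( E\left| \frac{|z_0|\sqrt{n}\,\tilde a_i}{\sqrt{\min(|\tilde a_i|,A)^2 + n - i + \sqrt{n-i}\,\eta_{n-i}}} \right|^2 + 1 \right)$.

Using (74), (75) and the bound $n - i \ge |z_0|^2 n - O(|z_0| n^{1/2+\varepsilon})$, and discarding the
non-negative $\min(|\tilde a_i|, A)^2$ term, we then obtain the upper bound

(102) $E|\tilde a_{i+1}|^4 \le (1 + O(|z_0|^{-1} n^{-1/2+\varepsilon}))\,E|\tilde a_i|^4 + O(E|\tilde a_i|^2 + 1)$,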

via a routine calculation. From (101) we have

$E|\tilde a_i|^2 \le E|\tilde a_{i_2}|^2$.

From (96) we also have

$E|\tilde a_{i_1}|^4 \le |z_0|^2 n^{1+\varepsilon}$;

if we then iterate (102) $O(|z_0| n^{1/2+\varepsilon/2})$ times, we obtain (100) as desired.



□ 



Now we need to use the repulsive properties of (77) near the origin to propagate
this initial largeness to later values of $i$. The key proposition is the following.

Proposition 59. Let $i_0 \le i_1 < i_2 \le n - n^{\varepsilon}/2$. Let $E_{i_1,i_2}$ be the event that
$|a_i| \le \frac{1}{2}\sqrt{i - (1-|z_0|^2)n}$ for all $i_1 \le i \le i_2$. Then we have with overwhelming
probability that

$|a_{i_2}|\,1_{E_{i_1,i_2}} \ge \left(1 + c\,\frac{i_1 - (1-|z_0|^2)n}{|z_0|^2 n}\right)^{i_2 - i_1} \left(|a_{i_1}| + O(n^{o(1)}\sqrt{i_2 - i_1})\right) 1_{E_{i_1,i_2}}$

for some constant $c > 0$.

Proof. The probability in question will be computed over the product space generated
by $\xi_i, \eta_i$ with $i_1 < i \le i_2$, conditioning all the other $\xi_i, \eta_i$ to be fixed. In
particular, $a_{i_1}$ is now deterministic.

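For any $i_1 \le i < i_2$, we see from (76) that

(103) $a_{i+1} = \beta_i a_i + \xi_{i+1}$
where $\beta_i$ is the positive real number

$\beta_i := \frac{|z_0|\sqrt{n}}{\sqrt{|a_i|^2 + n - i + \sqrt{n-i}\,\eta_{n-i}}}$.

Next, from iterating (103) we have

$a_{i_2} = \gamma_{i_1,i_2}\left( a_{i_1} + \sum_{i_1 \le i < i_2} \kappa_i \xi_{i+1} \right)$

where $\gamma_{i_1,i_2} := \beta_{i_1} \cdots \beta_{i_2-1}$ and $\kappa_i := \beta_i^{-1} \cdots \beta_{i_1}^{-1}$.
As the event $E_{i_1,i}$ contains $E_{i_1,i_2}$ for $i_1 \le i \le i_2$, we have

(104) $a_{i_2} 1_{E_{i_1,i_2}} = \gamma_{i_1,i_2} 1_{E_{i_1,i_2}} \left( a_{i_1} + \sum_{i_1 \le i < i_2} \kappa_i \xi_{i+1} 1_{E_{i_1,i}} \right)$.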

Notice that if $E_{i_1,i}$ holds, then

$|a_i|^2 \le \frac{1}{4}\left(i - (1-|z_0|^2)n\right)$

which is equivalent to

$|a_i|^2 + n - i \le |z_0|^2 n - \frac{3}{4}\left(i - (1-|z_0|^2)n\right)$.

On the other hand, since

$i - (1-|z_0|^2)n \ge i_1 - (1-|z_0|^2)n \ge |z_0| n^{1/2+\varepsilon}/2$
and $n - i \le |z_0|^2 n$, we deduce from (75) that

$|a_i|^2 + n - i + \sqrt{n-i}\,\eta_{n-i} \le |z_0|^2 n - \frac{1}{2}\left(i - (1-|z_0|^2)n\right)$

(say) if $n$ is large enough. This gives a bound of the form

$\beta_i \ge 1 + c\,\frac{i - (1-|z_0|^2)n}{|z_0|^2 n} \ge 1 + c\,\frac{i_1 - (1-|z_0|^2)n}{|z_0|^2 n}$

for some absolute constant $c > 0$.
From the definition of $\gamma_{i_1,i_2}$, we conclude the lower bound

(105) $\gamma_{i_1,i_2} 1_{E_{i_1,i_2}} \ge \left(1 + c\,\frac{i_1 - (1-|z_0|^2)n}{|z_0|^2 n}\right)^{i_2-i_1} 1_{E_{i_1,i_2}}$

and the upper bound

(106) $|\kappa_i|\,1_{E_{i_1,i}} \le 1_{E_{i_1,i}} \le 1$.

Let us now make a critical observation that the random variable $\kappa_i 1_{E_{i_1,i}}$
depends only on the $\xi_j$ with $j \le i$ (and on the $\chi_{i,\mathbb{C}}, \chi_{n-i,\mathbb{C}}$) but is independent of $\xi_{i+1},\dots,\xi_n$.
This enables us to apply Proposition 35, from which we can conclude that with
overwhelming probability

(107) $\sum_{i_1 \le i < i_2} \kappa_i 1_{E_{i_1,i}} \xi_{i+1} = O(n^{o(1)} |i_2 - i_1|^{1/2}) = O(n^{o(1)}\sqrt{i_2 - i_1})$,

concluding the proof. □

Corollary 60. Assume that \a^\ > rfl^T 1 / 2 where T := [ n _ l ( ^ ol 2 )n log 2 nj . 
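Then $1_{E_{i_1,i_1+T}} = 0$ holds with overwhelming probability.

Proof. Assume, for contradiction, that there is a fixed $A$ such that $P(E_{i_1,i_1+T}) \ge n^{-A}$.
By Proposition 59, we can assume that

$|a_{i_1+T}|\,1_{E_{i_1,i_1+T}} \ge \left(1 + c\,\frac{i_1 - (1-|z_0|^2)n}{|z_0|^2 n}\right)^{T} \left(|a_{i_1}| + O(n^{o(1)}\sqrt{T})\right) 1_{E_{i_1,i_1+T}}$

holds with probability at least $1 - n^{-2A}$. Taking expectations, we conclude
$E|a_{i_1+T}| \ge E|a_{i_1+T}|\,1_{E_{i_1,i_1+T}} \ge \left(1 + c\,\frac{i_1 - (1-|z_0|^2)n}{|z_0|^2 n}\right)^{T} \left(E|a_{i_1}| + O(n^{o(1)}\sqrt{T})\right) n^{-A}$.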

Since \a n \ > tf/^T 1 / 2 and (1 + ^"fcff ' )n ) T > exp(clog 2 n) for some fixed 
$c > 0$ by the definition of $T$, the RHS is bounded from below by

$n^{-A} \exp(c \log^2 n) \gg n$.

On the other hand, from Lemma 54 we have that

$E|a_{i_1+T}| \le (1 + |z_0|)\sqrt{n} \ll n$,
yielding the desired contradiction. □
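
The growth factor driving this contradiction can be sanity-checked numerically. The sketch below assumes that $T$ is the integer part of $\frac{|z_0|^2 n}{i_1 - (1-|z_0|^2)n} \log^2 n$ (our reading of the definition in Corollary 60); the name `ratio` stands for the quotient $\frac{|z_0|^2 n}{i_1 - (1-|z_0|^2)n}$, and the sample values are illustrative assumptions.

```python
import math

def log_growth(n: float, ratio: float, c: float) -> float:
    """log of (1 + c/ratio)^T with T = floor(ratio * log(n)^2).

    ratio stands for |z_0|^2 n / (i_1 - (1 - |z_0|^2) n) and is assumed >= 1;
    c is the constant from the bound on beta_i, assumed in (0, 1].
    """
    T = math.floor(ratio * math.log(n) ** 2)
    return T * math.log1p(c / ratio)
```

Whatever the size of `ratio`, the logarithm of the growth factor stays comparable to $c \log^2 n$, which is why it eventually exceeds any fixed multiple of $\log n$.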

Next, we observe that $a_i$ cannot drop in magnitude too quickly once it is somewhat
small (assuming the hypotheses (74), (75), of course):

Lemma 61. If $|a_i| \le \frac{1}{2}\sqrt{i - (1-|z_0|^2)n}$ then $|a_i| \ge |a_{i-1}| - n^{o(1)}$.
Proof. From (72) we have

$a_i - \xi_i = \frac{|z_0|\sqrt{n}}{\sqrt{|a_{i-1}|^2 + \chi_{n-i+1,\mathbb{C}}}}\, a_{i-1}$

and hence

$|a_i - \xi_i|^2 = \frac{|z_0|^2 n\,|a_{i-1}|^2}{|a_{i-1}|^2 + \chi_{n-i+1,\mathbb{C}}}$.

We can rearrange this as

$|a_{i-1}|^2 = \frac{\chi_{n-i+1,\mathbb{C}}}{|z_0|^2 n - |a_i - \xi_i|^2}\, |a_i - \xi_i|^2$.

By (75) we have

$\chi_{n-i+1,\mathbb{C}} = n - i + O(\sqrt{n-i}\, n^{o(1)}) = n - i + O(n^{o(1)} |z_0| \sqrt{n})$,
using the fact that in this range $n - i \le |z_0|^2 n$.

From the assumption of the lemma, we have that

$|a_i - \xi_i|^2 \le \frac{1}{4}\left(i - (1-|z_0|^2)n\right) + O\left(n^{o(1)}\sqrt{i - (1-|z_0|^2)n}\right)$

and thus

$\chi_{n-i+1,\mathbb{C}} - |z_0|^2 n + |a_i - \xi_i|^2 \le -\frac{3}{4}\left(i - (1-|z_0|^2)n\right) + O(n^{o(1)} |z_0| \sqrt{n}) + O\left(n^{o(1)}\sqrt{i - (1-|z_0|^2)n}\right)$.

As $i - (1-|z_0|^2)n \ge |z_0| n^{1/2+\varepsilon}$, we see that the right-hand side is negative for $n$
large enough, thus

$\frac{\chi_{n-i+1,\mathbb{C}}}{|z_0|^2 n - |a_i - \xi_i|^2} \le 1$.

We thus have

$|a_{i-1}| \le |a_i - \xi_i|$,

which implies from (74) that $|a_i| \ge |a_{i-1}| - n^{o(1)}$ as desired. □

We can now prove the lower bound (93) with overwhelming probability as follows. 
We first condition on the event that the conclusion of Lemma 58 holds. Now assume 
that there is some i < i < n — n e such that 
N < -0-(l-|z O |>.
Let i 2 be the first such index. In particular, 

(108) \a l2 \< 1 - 02 - (1 - \z \ 2 )n < l - 2 - (1 - |z | 2 )n. 

By Lemma 58, we can then locate an index max(io — \ |zo|n 1 / 2+e , 0) + 1 < i\ < i 2 
such that | a^l < — (1 — l^ol 2 )™ for all i\ < i < i 2 (or in other words, E ilti2 
holds) and 

|o»i-i| > 7,01 - 1 - (1 - M 2 H 
From Lemma 61, this implies in particular that 

(109) K| > .49901 - (1- M 2 )n. 

From the above discussion and the union bound, it thus suffices to show that for
any given $i_0 \le i_1 < i_2 \le n - n^{\varepsilon}$, the event that (108) and (109) and $E_{i_1,i_2}$ all
simultaneously hold, is false with overwhelming probability.

Fix $i_1, i_2$. If $i_2 - i_1 \ge T$ then by Corollary 60, $1_{E_{i_1,i_2}} = 0$ with overwhelming
probability and we are done. In the other case $i_2 - i_1 < T$, by Proposition 59, we
have with overwhelming probability

(110) $|a_{i_2}|\,1_{E_{i_1,i_2}} \ge \left(1 + c\,\frac{i_1 - (1-|z_0|^2)n}{|z_0|^2 n}\right)^{i_2 - i_1} \left(|a_{i_1}| + O(n^{o(1)}\sqrt{i_2 - i_1})\right) 1_{E_{i_1,i_2}}$.

It now sufHccs to verify that if \a^\ > .49901 — (1 — |z | 2 )n, ^»i,i 2 holds, and 
|oj 2 1 < |02 — (1 — l z o| 2 )«, then the above inequality is violated. Notice that since 
«2 - h < T = ii _ ( l ^° l | z "| 27t) log 2 n and h - (1 - \z \ 2 )n > |zo|« 1/2+£ , wc have 

\a il \+0{n°^s/i 2 ~^h > -4990! - (1 - |z | 2 )n-O(n°( 1 )r 1 / 2 ) > ^0! - (1 - |z | 2 )n. 
As E il . i2 holds, it follows that the RHS of (110) is at least 

A0 1 -(l-|z o |2)„>l0 2 -(l_|z o |2 n) 

again thanks to the fact that i 2 — i\ < T = o{i\ — (1 — |z | 2 )n). Our proof is 
complete. 

Remark 62. All the above arguments go through without difficulty in the real 
case, using (73) instead of (72), replacing a^^Xi.c by G^,£i,Xi,iR respectively; we 
leave the details to the interested reader. 
10. Concentration of log-determinant for iid matrices



Now that we have established concentration of the log-determinant in the special
case of real and complex gaussian matrices (Theorem 33), we are ready to
apply the resolvent swapping machinery from Section 7 to obtain concentration for
more general iid matrices (Theorem 25).

Fix $\delta$, $z_0$. Let $W_{n,z_0}$ be defined as in (16). As in the previous section, set $\alpha$ equal
to $\frac{1}{2}(|z_0|^2 - 1)$ if $|z_0| \le 1$, and $\log |z_0|$ if $|z_0| \ge 1$. It suffices to show that

$\log |\det(W_{n,z_0})| = 2n\alpha + O(n^{o(1)})$

with overwhelming probability, uniformly in $z_0$. We may assume without loss of
generality that all entries of $M_n$ are $O(n^{o(1)})$.

We observe the identity 

log | det(W„. Z0 )| = log | Aet{W n , Za - V^1T)\ - nlm [ 8 (V=irj) dn 

Jo 

for any T > 0, where s(z) := - trace {W n ,z ~ z ) _1 is the Stieltjes transform, as 
can be seen by writing everything in terms of the eigenvalues of W n>Zo . If we set 
T := n 100 then we see that 

log|det(W„,* ->/=TT)| =nlo g r + log|det(l-n- 100 ^ )| 

= nlogT + 0(n- 10 ) 

(say), thanks to (58) and the hypothesis that \zj\ < ^/n. Thus it suffices to show 
that 

nlm s(y^ln) dn ^ nlogT - 2na + 0(n° W ) 
Jo 

with overwhelming probability. 
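
The identity relating the log-determinant to the integrated resolvent trace can be checked at the level of eigenvalues. The sketch below is normalisation-free: it works with $\operatorname{Im} \operatorname{trace}(W - \sqrt{-1}\eta)^{-1} = \sum_j \eta/(\lambda_j^2 + \eta^2)$ directly rather than with $s$, so the constant in front of $s$ is set aside. The eigenvalue list, the cutoff $T$ and the midpoint quadrature are assumptions chosen for illustration.

```python
import math

def log_abs_det(eigs):
    """log|det W| for a Hermitian matrix with the given real eigenvalues."""
    return sum(math.log(abs(lam)) for lam in eigs)

def log_abs_det_shifted(eigs, T):
    """log|det(W - iT)|, using |lam - iT| = sqrt(lam^2 + T^2)."""
    return sum(0.5 * math.log(lam * lam + T * T) for lam in eigs)

def im_trace_resolvent(eigs, eta):
    """Im trace (W - i*eta)^{-1} = sum_j eta / (lam_j^2 + eta^2)."""
    return sum(eta / (lam * lam + eta * eta) for lam in eigs)

def integral_im_trace(eigs, T, steps=20000):
    """Midpoint-rule approximation of the integral of im_trace_resolvent on [0, T]."""
    h = T / steps
    return h * sum(im_trace_resolvent(eigs, (k + 0.5) * h) for k in range(steps))
```

With these helpers, `log_abs_det(eigs)` agrees with `log_abs_det_shifted(eigs, T) - integral_im_trace(eigs, T)` up to quadrature error, which is the eigenvalue form of the identity.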

Now we eliminate the contribution of very small $\eta$.
Lemma 63. One has

$n \operatorname{Im} \int_0^{1/n} s(\sqrt{-1}\eta)\, d\eta = O(n^{o(1)})$

with overwhelming probability.

Proof. From Proposition 31 we see with overwhelming probability that

$|s(\sqrt{-1}\eta)| \ll n^{o(1)}\left(1 + \frac{1}{n\eta}\right)$

for all $\eta > 0$. This already handles the portion of the integral where $\eta \ge n^{-2\log n}$
(say). For the remaining portion when $0 < \eta < n^{-2\log n}$, we observe from Proposition 27
that with overwhelming probability, all eigenvalues of $W_{n,z_0}$ are at least
$n^{-\log n}$ in magnitude, which implies that $s(\sqrt{-1}\eta) = O(n^{1+\log n})$ for all such $\eta$, and
the claim follows. □

Set $X := n \operatorname{Im} \int_{1/n}^{T} s(\sqrt{-1}\eta)\, d\eta$ and $X_* := n \log T - 2n\alpha$. Fix arbitrary constants
$A, \varepsilon > 0$. In view of the above lemma, it suffices to show that

$P(|X - X_*| \ge n^{\varepsilon}) \ll n^{-A}$.

By Markov's inequality, it suffices to show that for $j = 2\lceil A/\varepsilon \rceil$

(111) $E(X - X_*)^j = O(n^{j\varepsilon/2})$.
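
The exponent bookkeeping behind this use of Markov's inequality can be sketched as follows. The helper names are hypothetical, and the choice of the ceiling in `moment_order` is an assumption about the bracket in the definition of $j$; it is the choice that makes $j\varepsilon/2 \ge A$.

```python
import math

def moment_order(A: float, eps: float) -> int:
    """An even moment order j with j*eps/2 >= A, here j = 2*ceil(A/eps)."""
    return 2 * math.ceil(A / eps)

def tail_exponent(j: int, eps: float) -> float:
    """Exponent e with P(|X - X_*| >= n^eps) <= C * n^e under Markov's inequality,
    assuming the moment bound E|X - X_*|^j <= C * n^(j*eps/2)."""
    return j * eps / 2 - j * eps  # moment exponent minus threshold exponent
```

For instance, with $A = 3$ and $\varepsilon = 1/4$ this gives $j = 24$ and a tail bound of order $n^{-3}$.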

Without loss of generality we may assume $j$ to be large, e.g. $j \ge 5$. By Theorem
33, we know that a stronger bound

(112) $E(X' - X_*)^j \le n^{\varepsilon}$

holds for the same range of $j$ (for $n$ sufficiently large depending on $\varepsilon$ and $j$), where
$X'$ is defined as in $X$ but with $M_n$ replaced by a random real or complex gaussian
matrix $M'_n$ that matches $M_n$ to third order.

We now execute the following swapping process. Start with the random gaussian
matrix $M'_n$ and in each step swap either the real or imaginary part of a gaussian
entry of $M'_n$ to the associated real or imaginary part of the corresponding entry
of $M_n$. The exact order in which we perform this swapping is not important, so
long as it is chosen in advance; for instance, one could use lexicographical ordering,
swapping the real part and then the imaginary part for each entry in turn. Let $M_n^{(k)}$,
$0 \le k \le 2n^2$ be the resulting random matrix at time $k$ and define $X^{(k)}$ accordingly.
We will show, by induction on $k$, that

(113) $E(X^{(k)} - X_*)^j \le \left(1 + \frac{k}{n^2}\right) n^{\varepsilon}$

for $n$ sufficiently large depending on $\varepsilon$ and $j$ (but not on $k$). Note that the base
case $k = 0$ of (113) holds thanks to (112), while the case $k = 2n^2$ implies (111)
with some room to spare.

For technical reasons, it is convenient to assume that $|\xi|, |\xi'| = n^{o(1)}$ with probability
one. This can be done by replacing all entries $\xi_{ij}$ by $\xi_{ij} 1_{|\xi_{ij}| \le \log^B n}$ and $\xi'_{ij}$ by
$\xi'_{ij} 1_{|\xi'_{ij}| \le \log^B n}$, where $B$ is a sufficiently large constant so that with overwhelming
probability $|\xi_{ij}| + |\xi'_{ij}| \le \log^B n$ for all $i, j$. It is clear that any event that holds
with overwhelming probability in the truncated model also holds with overwhelming
probability in the original one. Thus, we can reduce to the truncated case. At
this point we would like to point out that the truncation does change the moments
of the entries, but by a very small amount that will only introduce negligible factors
such as $O(n^{-100})$ to the swapping argument. Abusing notation slightly, from now
on we still work with $\xi$ and $\xi'$ but under the extra assumption that with probability
one $|\xi|, |\xi'| \le \log^B n = n^{o(1)}$.

Fix a step $0 \le k < 2n^2$, and consider the difference

(114) $D_k := E(X^{(k+1)} - X_*)^j - E(X^{(k)} - X_*)^j = \int E\left[ (X^{(k+1)} - X_*)^j - (X^{(k)} - X_*)^j \,\middle|\, M_0 \right] dM_0$,

where $M_0$ is obtained from $M_n^{(k+1)}$ by putting $0$ at the swapping position (in other
words, $M_0$ is the common part of $M_n^{(k)}$ and $M_n^{(k+1)}$), and $dM_0$ is the law of $M_0$.
Once conditioned on $M_0$, we can simplify the notation by replacing $X^{(k)}$ and $X^{(k+1)}$
by $X_{\xi}$ and $X_{\xi'}$ respectively.



It is important to notice that since $\eta \ge 1/n$, we can bound $|s^{(k)}(\sqrt{-1}\eta)|$ crudely
by $n$ with probability one (for any matrix $M_n^{(k)}$). As $T = n^{100}$, this implies that
$|X^{(k)}| \le n^{102}$ and

(115) $|(X^{(k+1)} - X_*)^j - (X^{(k)} - X_*)^j| \ll n^{102j}$

for any $j$, with probability one.

By Proposition 31, we see with overwhelming probability that

$\|R_{\xi}(\sqrt{-1}\eta)\|_{(\infty,1)} \ll n^{o(1)}$
for all $\eta \ge n^{-1}$. In this case, by Lemma 44 and (61)

(116) $\|R_0(\sqrt{-1}\eta)\|_{(\infty,1)} \ll n^{o(1)}$
for all such $\eta$.

If (116) holds, we say that $M_0$ is good. The contribution from bad $M_0$ in the RHS
of (114) is very small. Indeed, by Proposition 31, we can assume that $M_0$ is bad
with probability at most $n^{-102j-100}$. By the upper bound (115), the integral (in
$D_k$) over the bad $M_0$ is at most

(117) $n^{-102j-100}\, n^{102j} = n^{-100}$.

Let us now condition on a good M . By Proposition 45, we have 



(118) *£(>/=T»7) = S0 + E Cn- l/2 cM + 0(n- 2+0 « — ). 
where the coefficient Ci(ri) is independent of £ and enjoys the bound |cj(?7)| <C
Multiplying by n and taking the integral over 77, we obtain, 



(119) $X_{\xi} = X_0 + P(\xi) + O(n^{-2+o(1)})$

where $P = \sum_{i=1}^{3} \xi^i n^{-i/2} d_i$ is a polynomial in $\xi$ with coefficients $d_i = O(n^{o(1)})$, and
$X_0$ is a quantity independent of $\xi$. As $|\xi| = n^{o(1)}$ with probability one, it follows
that $|X_{\xi} - X_0| = n^{-1/2+o(1)}$ with probability one. Furthermore,

(120) $X_{\xi} - X_* = (X_0 - X_*) + P(\xi) + O(n^{-2+o(1)})$.

We raise this equation to the power $j$, focusing on those terms of order $\xi^4$ or more.
As $d_i = O(n^{o(1)})$, using the fact that $|\xi| \le n^{o(1)}$ with probability one and $j \ge 5$, we
have

(121) $(X_{\xi} - X_*)^j = P_j(\xi) + O\left( n^{-2+o(1)} \sum_{l=1}^{j-1} |X_0 - X_*|^l + n^{-5/2+o(1)} \right)$,

where $P_j$ is a polynomial of degree at most 3. Therefore,

(122) $E(X_{\xi} - X_*)^j = E P_j(\xi) + O\left( n^{-2+o(1)} \sum_{k=1}^{j-1} |X_0 - X_*|^k + n^{-5/2+o(1)} \right)$.

Similarly

(123) $E(X_{\xi'} - X_*)^j = E P_j(\xi') + O\left( n^{-2+o(1)} \sum_{k=1}^{j-1} |X_0 - X_*|^k + n^{-5/2+o(1)} \right)$.

Here the expectations are with respect to $\xi$ and $\xi'$ (as we already conditioned on
a good $M_0$). It follows that

(124) $E(X_{\xi} - X_*)^j - E(X_{\xi'} - X_*)^j = E(P_j(\xi) - P_j(\xi')) + O\left( n^{-2+o(1)} \sum_{k=1}^{j-1} |X_0 - X_*|^k + n^{-5/2+o(1)} \right)$.

As already pointed out, the first three moments of $\xi$ and $\xi'$ do not entirely match
due to the truncation. However, by fixing $B$ large enough, we can assume that the
truncation changes each moment by at most $n^{-C}$ for some sufficiently large $C$ (we
need $C$ to be larger than the absolute value of the coefficients of $P_j$, which are of
size $O(n^{o(1)})$, again thanks to the fact that $|s_{\xi}(\sqrt{-1}\eta)| \le n$ with probability one).
This yields

(125) $E(X_{\xi} - X_*)^j - E(X_{\xi'} - X_*)^j = O\left( n^{-2+o(1)} \sum_{k=1}^{j-1} |X_0 - X_*|^k + n^{-5/2+o(1)} \right)$.

But $|X_{\xi} - X_0| \le n^{-1/2+o(1)}$ with probability one, so (125) implies

(126) $E(X_{\xi} - X_*)^j - E(X_{\xi'} - X_*)^j = O\left( n^{-2+o(1)} \sum_{k=1}^{j-1} E|X_{\xi} - X_*|^k + n^{-5/2+o(1)} \right)$.

The right-hand side of (126) can be bounded as

(127) $O\left( n^{-2+o(1)} \min\{ E|X_{\xi} - X_*|^j\, n^{-\varepsilon/4j},\ n^{\varepsilon/2} \} \right)$,

where the bound comes from considering two cases $E|X_{\xi} - X_*|^j$ being not smaller
or smaller than $n^{\varepsilon/2}$, and the Hölder inequality.

Thus, conditioned on a good $M_0$, we have

$|E(X_{\xi} - X_*)^j - E(X_{\xi'} - X_*)^j| \ll n^{-2+o(1)} \min\{ E|X_{\xi} - X_*|^j\, n^{-\varepsilon/4j},\ n^{\varepsilon/2} \}$.
Taking into account (117), we conclude

$D_k \ll n^{-100} + n^{-2-\varepsilon/4j}\, E|X^{(k)} - X_*|^j + n^{-2+\varepsilon/2+o(1)}$,

and the desired bound (113) on $E(X^{(k+1)} - X_*)^j$ follows easily by the induction
hypothesis.

Appendix A. Spectral properties of $W_{n,z}$

In this appendix we prove Proposition 29 and Proposition 31. We fix $M_n$, $C$, $z_0$
as in these propositions. By truncation we may assume that all the coefficients of
$M_n$ have magnitude $O(n^{o(1)})$.

A.1. Crude upper bound. We begin with Proposition 29, which we will prove
by modifying the argument from [56, Appendix C] and [57, Proposition 28]. Write
$I = [E - \eta, E + \eta]$. It suffices to establish the claim in the case $1/n \le \eta \le 1$, as
the general case then follows from this case (and from the trivial bound $N_I \le 2n$).
By rounding $\eta$ to the nearest integer power of two, and using the union bound,
it suffices to establish the claim for a single $\eta$ in this range, which we now fix.
Similarly, we may round $E$ to a multiple of $\eta$; since the claim is easy for (say)
$|E| \ge n^{10}$, we see from the union bound that it suffices to establish the claim for a
single $E$, which we now also fix. By symmetry we may take $E \ge 0$.

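By a diagonalisation argument, it will suffice to show for each fixed $c > 0$ that one
has

$N_{[E-\eta,E+\eta]} \ll n^{1+c}\eta$

with overwhelming probability. Accordingly, we assume for contradiction that

(128) $N_{[E-\eta,E+\eta]} \ge n^{1+c}\eta$.
We use the Stieltjes transform

$s(E + \sqrt{-1}\eta) = \frac{1}{2n} \operatorname{trace}(W_{n,z} - E - \sqrt{-1}\eta)^{-1}$.

Then

$\operatorname{Im} s(E + \sqrt{-1}\eta) = \frac{1}{2n} \sum_{j=1}^{2n} \frac{\eta}{(\lambda_j(W_{n,z}) - E)^2 + \eta^2}$;

from (128) we thus have

$\operatorname{Im} s(E + \sqrt{-1}\eta) \gg n^{c}$.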
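
The step from the eigenvalue count to the Stieltjes transform rests on the elementary fact that each eigenvalue within $\eta$ of $E$ contributes at least $1/(2\eta)$ to the sum. A minimal sketch of that inequality follows; the eigenvalue list is an arbitrary illustrative assumption, and the normalisation is by the number $N$ of eigenvalues (so $N = 2n$ in the setting above).

```python
def im_stieltjes(eigs, E, eta):
    """Im s(E + i*eta) for s(z) = (1/N) * sum_j 1/(lam_j - z), with N = len(eigs)."""
    return sum(eta / ((lam - E) ** 2 + eta ** 2) for lam in eigs) / len(eigs)

def count_in_window(eigs, E, eta):
    """Number of eigenvalues in the interval [E - eta, E + eta]."""
    return sum(1 for lam in eigs if abs(lam - E) <= eta)
```

In general `im_stieltjes(eigs, E, eta) >= count_in_window(eigs, E, eta) / (2 * len(eigs) * eta)`, so a count of at least $n^{1+c}\eta$ among $2n$ eigenvalues forces the imaginary part to be at least $n^{c}/4$.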

In particular, since

$s(E + \sqrt{-1}\eta) = \frac{1}{2n} \sum_{j=1}^{2n} R(E + \sqrt{-1}\eta)_{jj}$

we see from the pigeonhole principle that we have

(129) $|R(E + \sqrt{-1}\eta)_{jj}| \gg n^{c}$

for some $1 \le j \le 2n$. By the union bound, it suffices to show that for each $j$, the
hypothesis (129) (combined with (128)) leads to a contradiction with overwhelming
probability.

Fix $j$; by symmetry we may take $j = 2n$, thus
(130) $|R(E + \sqrt{-1}\eta)_{2n,2n}| \gg n^{c}$.

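We expand $W_{n,z}$ as

$W_{n,z} = \begin{pmatrix} W'_{n,z} & X \\ X^* & 0 \end{pmatrix}$

where $W'_{n,z}$ is the $(2n-1) \times (2n-1)$ Hermitian matrix

$W'_{n,z} := \begin{pmatrix} 0 & \begin{pmatrix} \frac{1}{\sqrt{n}}(M_{n-1} - z) \\ Z \end{pmatrix} \\ \begin{pmatrix} \frac{1}{\sqrt{n}}(M_{n-1} - z)^* & Z^* \end{pmatrix} & 0 \end{pmatrix}$

where $M_{n-1}$ is the top left $(n-1) \times (n-1)$ minor of $M_n$, $Z$ is the $(n-1)$-dimensional
row vector with entries $\frac{1}{\sqrt{n}}\xi_{nj}$ for $j = 1, \dots, n-1$, $X$ is the $(2n-1)$-dimensional column
vector

$X = \begin{pmatrix} X' \\ \frac{1}{\sqrt{n}}(\xi_{nn} - z) \\ 0 \end{pmatrix}$

and $X'$ is the $(n-1)$-dimensional column vector with entries $\frac{1}{\sqrt{n}}\xi_{jn}$ for $j = 1, \dots, n-1$.

By Schur's complement, the resolvent coefficient $R(E + \sqrt{-1}\eta)_{2n,2n}$ can be
expressed as

(131) $R(E + \sqrt{-1}\eta)_{2n,2n} = -\frac{1}{E + \sqrt{-1}\eta + Y_n}$

where $Y_n$ is the expression

$Y_n := X^*(W'_{n,z} - E - \sqrt{-1}\eta)^{-1} X$.

By (130) we conclude that

$|E + \sqrt{-1}\eta + Y_n| \ll n^{-c}$;
as $Y_n$ has a non-negative imaginary part, we conclude that

(132) $\operatorname{Im} Y_n \ll n^{-c}$.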

Next , we apply the singular value decomposition to the n x n— 1 matrix ^ V™ ^ ^ 1 

generating an orthonormal basis of n right singular vectors m, . . . , u n in C™, and 
an orthonormal basis of n — 1 left singular vectors in C n_1 , associated to singular 
values <7i, . . . , ct„ (with ct„ = 0). Then 2 is conjugate to the direct sum 

and thus 



and thus 



Jee — - ■ ^ 



2 



J 



2~E E \E-ea^ + V ^ X * U ^ 



where 



is the top half of X. 



X := 



X' 

(£rm - z) 
By (128) and the Cauchy interlacing law, we may find an interval $[j_-, j_+]$ of length
$j_+ - j_- \gg n^{1+c}\eta$ such that $|\sigma_j - E| \le \eta$ for all $j_- \le j \le j_+$. We conclude that

$\sum_{j_- \le j \le j_+} |\tilde X^* u_j|^2 \ll n^{-c}\eta$.

At this point we will follow [19] and invoke a concentration estimate for quadratic
forms essentially due to Hanson and Wright [29], [64].

Proposition 64 (Concentration). Let $\xi_1, \dots, \xi_n$ be iid complex random variables
with mean zero, variance one, and bounded in magnitude by $K$ for some $K \ge 1$.
Let $X \in \mathbb{C}^n$ be a random vector of the form $Y + Z$, where

$Y := \frac{1}{\sqrt{n}} (\xi_1, \dots, \xi_n)^T$

and $Z$ is a random vector independent of $Y$. Let $A = (a_{ij})_{1 \le i,j \le n}$ be a random
complex matrix that is also independent of $Y$. Then with overwhelming probability
one has

$X^* A X = \frac{1}{n} \operatorname{trace} A + Z^* A Z + O\left( K^2 \log^2 n \left( \frac{1}{n}\|A\|_F + \frac{1}{\sqrt{n}}\|AZ\| + \frac{1}{\sqrt{n}}\|A^* Z\| \right) \right)$

where $\|A\|_F := (\sum_{1 \le i,j \le n} |a_{ij}|^2)^{1/2}$ is the Frobenius norm of $A$.



We remark that for our applications, one could also use Talagrand's concentration 
inequality [49] as a substitute for this concentration inequality, at the cost of a 
slight degradation in the bounds; see e.g. [56]. 



Proof. By conditioning we may assume that Z, A are deterministic (the failure 
probability in our estimates will be uniform in the choice of Z, A). Let £j := 
From [19, Proposition 4.5] we have 

a iMo= a lJ EM 1 +0(\\A\\ F \og 2 n) 

l<ij'<n l<i.j<n 

with overwhelming probability. Multiplying by K 2 /n and noting that E&£j = li=j, 
we conclude that 

Y*AY=±tr m A + o(¥^\\A\\ F ) 
n \ n J 

with overwhelming probability. Meanwhile, from the Chernoff inequality we see 
that 

Y*AZ = 0(^\\AZ\\) 

and similarly 

Z*AY = o(^\\A*Z\\) 
with overwhelming probability. The claim follows. □ 
Applying Proposition 64 (with $A$ equal to the projection matrix $\pi := \sum_{j_- \le j \le j_+} u_j u_j^*$),
one has

$\sum_{j_- \le j \le j_+} |\tilde X^* u_j|^2 = \frac{j_+ - j_- + 1}{n} + \frac{|z|^2}{n}\|\pi(e_n)\|^2 + O\left(n^{-1+o(1)} (j_+ - j_- + 1)^{1/2}\right) + O\left(n^{-1/2+o(1)} \frac{|z|}{\sqrt{n}} \|\pi(e_n)\|\right)$

with overwhelming probability. By the arithmetic mean-geometric mean inequality
one has $\frac{|z|^2}{n}\|\pi(e_n)\|^2 + O\left(n^{-1/2+o(1)} \frac{|z|}{\sqrt{n}}\|\pi(e_n)\|\right) \ge -n^{-1+o(1)}$, and we conclude that

$\sum_{j_- \le j \le j_+} |\tilde X^* u_j|^2 \gg n^{c}\eta$

with overwhelming probability (conditioning on $M_{n-1}$, $Z$). Undoing the conditioning,
we thus obtain a contradiction with overwhelming probability, and Proposition
29 follows.



A.2. Resolvent bounds. We now prove Proposition 31, by using a more complicated
variant of the arguments above. We first take advantage of the fact that the
spectral parameter $\sqrt{-1}\eta$ is on the imaginary axis to make some minor simplifications.
Namely, we have

$R(\sqrt{-1}\eta) = (W_{n,z} - \sqrt{-1}\eta)^{-1} = (W_{n,z} + \sqrt{-1}\eta)(W_{n,z}^2 + \eta^2)^{-1}$.

Note from (16) that $W_{n,z}^2 + \eta^2$ is block-diagonal, and thus $W_{n,z}(W_{n,z}^2 + \eta^2)^{-1}$
vanishes on the diagonal. We conclude that $R(\sqrt{-1}\eta)_{jj}$ and $s(\sqrt{-1}\eta)$ are purely
imaginary (with non-negative imaginary part) for $1 \le j \le n$, with

(133) $\operatorname{Im} s(\sqrt{-1}\eta) = \frac{\eta}{2n} \operatorname{trace}(W_{n,z}^2 + \eta^2)^{-1} = \frac{\eta}{n} \operatorname{trace}((M_n - z)^*(M_n - z) + \eta^2)^{-1}$.

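Now we observe that it suffices to verify the claim for $\eta \ge n^{-1+c}$ for each fixed $c$.
To see this, observe that

$\operatorname{Im} R(\sqrt{-1}\eta)_{jj} = \sum_{k=1}^{2n} \frac{\eta\,|u_{k,j}|^2}{\lambda_k(W_{n,z})^2 + \eta^2}$

for any $1 \le j \le 2n$, where $u_1, \dots, u_{2n}$ are an orthonormal basis of eigenvectors for
$W_{n,z}$, and $u_{k,j}$ is the $j$th coefficient of $u_k$. Thus, if we can obtain Proposition 31
for $\eta \ge n^{-1+c}$, we conclude with overwhelming probability that

$\sum_{k=1}^{2n} \frac{\eta\,|u_{k,j}|^2}{\lambda_k(W_{n,z})^2 + \eta^2} \ll n^{o(1)}$

for all $\eta \ge n^{-1+c}$, and hence that

$\sum_{1 \le k \le 2n:\ |\lambda_k(W_{n,z})| \le \eta} |u_{k,j}|^2 \ll n^{o(1)}\eta$

for all $\eta \ge n^{-1+c}$. This implies that

$\sum_{1 \le k \le 2n:\ |\lambda_k(W_{n,z})| \le \eta} |u_{k,j}|^2 \ll n^{o(1)}(\eta + n^{-1+c})$

for all $\eta > 0$. By dyadic summation (using the crude upper bound $\lambda_k(W_{n,z}) = O(n^{O(1)})$),
this implies that

$\sum_{k=1}^{2n} \frac{|u_{k,j}|^2}{|\lambda_k(W_{n,z}) - \sqrt{-1}\eta|} \ll n^{c+o(1)}\left(1 + \frac{1}{n\eta}\right)$

for all $\eta > 0$. Similarly with $u_{k,j}$ replaced by $u_{k,i}$. By Cauchy-Schwarz, we conclude
that

$\left| \sum_{k=1}^{2n} \frac{u_{k,i}\,\overline{u_{k,j}}}{\lambda_k(W_{n,z}) - \sqrt{-1}\eta} \right| \ll n^{c+o(1)}\left(1 + \frac{1}{n\eta}\right)$

for any $\eta > 0$. The left-hand side is $|R(\sqrt{-1}\eta)_{ij}|$. The claim then follows by using a
diagonalisation argument.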

A similar argument reveals that we may assume without loss of generality that
$\eta$ is an integer power of two. Note that the above argument shows that one only
needs to verify the diagonal case $i = j$; by symmetry and the union bound we may
take $i = j = 2n$. The claim is trivially verified for $\eta \ge n^{10}$ (say), so we may assume
that $\eta$ lies between $n^{-1+c}$ and $n^{10}$; by the union bound, we may now consider $\eta$ as
fixed. By diagonalisation (and the imaginary nature of the resolvent), it will now
suffice to show that

(135) $\operatorname{Im} R(\sqrt{-1}\eta)_{2n,2n} \ll n^{c+o(1)}$

with overwhelming probability.

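From (131) (and the fact that $R(\sqrt{-1}\eta)_{2n,2n}$ is imaginary) we have

(136) $\operatorname{Im} R(\sqrt{-1}\eta)_{2n,2n} = \frac{1}{\eta + \operatorname{Im} Y_n}$

where

$Y_n := X^*(W'_{n,z} - \sqrt{-1}\eta)^{-1} X$.

From the block-diagonal nature of $W_{n,z}$ as before we see that $Y_n$ is purely imaginary,
with non-negative imaginary part; indeed, we have

(137) $\operatorname{Im} Y_n = \eta\, \tilde X^* (AA^* + \eta^2)^{-1} \tilde X$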

where A is the n x n — 1 matrix 



M n _! - z 



A :-- 

Thus we have the crude bound

(138) $\operatorname{Im} R(\sqrt{-1}\eta)_{2n,2n} \le \frac{1}{\eta}$

which already takes care of the case when $\eta$ is large (e.g. $\eta \ge n^{-c}$).

On the other hand, we see from Proposition 64 that with overwhelming probability
one has

$\tilde X^*(AA^* + \eta^2)^{-1}\tilde X = \frac{1}{n}\operatorname{trace}(AA^* + \eta^2)^{-1} + \frac{|z|^2}{n} e_n^*(AA^* + \eta^2)^{-1} e_n + O\left(n^{-1+o(1)} \|(AA^* + \eta^2)^{-1}\|_F\right) + O\left(n^{-1+o(1)} |z|\, \|(AA^* + \eta^2)^{-1} e_n\|\right)$.

From the spectral theorem one has

$\|(AA^* + \eta^2)^{-1} e_n\| \le \left(e_n^*(AA^* + \eta^2)^{-1} e_n\right)^{1/2} \eta^{-1}$
and thus by Young's inequality (or the arithmetic mean-geometric mean inequality)

$n^{-1+o(1)} |z|\, \|(AA^* + \eta^2)^{-1} e_n\| = o\left(\frac{|z|^2}{n} e_n^*(AA^* + \eta^2)^{-1} e_n\right) + O\left(n^{-1+o(1)} \eta^{-2}\right)$.
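Also, we may expand

$\|(AA^* + \eta^2)^{-1}\|_F^2 = \sum_{j=1}^{n} \frac{1}{(\sigma_j(A)^2 + \eta^2)^2}$

where $\sigma_1(A), \dots, \sigma_n(A)$ are the $n$ singular values of $A$ (thus one of these singular
values is automatically zero). From Proposition 29 and the Cauchy interlacing law,
we see with overwhelming probability that for any interval $[-r, r]$, the number of
singular values of $A$ in this interval is $O(n^{o(1)}(1 + nr))$. From dyadic summation
we then see that

(139) $\|(AA^* + \eta^2)^{-1}\|_F \ll n^{o(1)} (n\eta)^{1/2} / \eta^2$.

Similarly, one has

$\operatorname{trace}(AA^* + \eta^2)^{-1} = \sum_{j=1}^{n} \frac{1}{\sigma_j(A)^2 + \eta^2}$

and thus by interlacing

$\operatorname{trace}(AA^* + \eta^2)^{-1} = \sum_{j=1}^{n} \frac{1}{\sigma_j(M_n - z)^2 + \eta^2} + O\left(\frac{1}{\eta^2}\right)$.

But from (133) we have

$\frac{\eta}{n} \sum_{j=1}^{n} \frac{1}{\sigma_j(M_n - z)^2 + \eta^2} = \operatorname{Im} s(\sqrt{-1}\eta)$

and thus

(140) $\frac{\eta}{n} \operatorname{trace}(AA^* + \eta^2)^{-1} = \operatorname{Im} s(\sqrt{-1}\eta) + O\left(\frac{1}{n\eta}\right)$.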

Putting all this together with (137), we see that with overwhelming probability
one has

$\operatorname{Im} Y_n = \operatorname{Im} s(\sqrt{-1}\eta) + (1 + o(1)) \frac{|z|^2}{n} \eta\, e_n^*(AA^* + \eta^2)^{-1} e_n + O\left(\frac{n^{o(1)}}{n\eta}\right) + O\left(\frac{n^{o(1)}}{\sqrt{n\eta}}\right)$,

which, in view of the lower bound $\eta \ge n^{-1+c}$, simplifies to

(141) $\operatorname{Im} Y_n = \operatorname{Im} s(\sqrt{-1}\eta) + (1 + o(1)) \frac{|z|^2}{n} \eta\, e_n^*(AA^* + \eta^2)^{-1} e_n + o(1)$.

Now we evaluate the expression $e_n^*(AA^* + \eta^2)^{-1} e_n$. Observe that

$AA^* + \eta^2 = \begin{pmatrix} (M_{n-1} - z)(M_{n-1} - z)^* + \eta^2 & (M_{n-1} - z) Y^* \\ Y (M_{n-1} - z)^* & YY^* + \eta^2 \end{pmatrix}$.

By Schur's complement, we thus have

$e_n^*(AA^* + \eta^2)^{-1} e_n = \frac{1}{YY^* + \eta^2 - Y(M_{n-1} - z)^*\left((M_{n-1} - z)(M_{n-1} - z)^* + \eta^2\right)^{-1}(M_{n-1} - z)Y^*}$.

One can simplify this using the identity

$B^*(BB^* + \eta^2)^{-1} B = 1 - \eta^2 (B^*B + \eta^2)^{-1}$,

valid for any matrix $B$ (which can be seen either from the singular value decomposition,
or by multiplying both sides of the identity by $(B^*B + \eta^2)$), to conclude
that

$\eta\, e_n^*(AA^* + \eta^2)^{-1} e_n = \frac{1}{\eta + \eta\, Y\left((M_{n-1} - z)^*(M_{n-1} - z) + \eta^2\right)^{-1} Y^*}$.
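
The matrix identity $B^*(BB^* + \eta^2)^{-1}B = 1 - \eta^2(B^*B + \eta^2)^{-1}$ used in this simplification can be checked numerically on a small example. The sketch below is restricted to real $2 \times 2$ matrices (so that $B^*$ is the transpose) purely to stay within the standard library; the helper names and the sample matrix are assumptions for illustration, while the identity itself holds for any matrix $B$.

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def inv2(P):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def add_scalar(P, s):
    """P + s * identity for a 2x2 matrix P."""
    return [[P[i][j] + (s if i == j else 0.0) for j in range(2)] for i in range(2)]

def both_sides(B, eta):
    """Return (B^T (B B^T + eta^2)^{-1} B,  I - eta^2 (B^T B + eta^2)^{-1})."""
    Bt = transpose(B)
    lhs = matmul(matmul(Bt, inv2(add_scalar(matmul(B, Bt), eta ** 2))), B)
    inner = inv2(add_scalar(matmul(Bt, B), eta ** 2))
    rhs = [[(1.0 if i == j else 0.0) - eta ** 2 * inner[i][j] for j in range(2)]
           for i in range(2)]
    return lhs, rhs
```

Calling `both_sides([[1.0, 2.0], [0.0, 3.0]], 0.7)` returns two matrices that agree entrywise up to rounding error.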
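Applying Proposition 64, we see with overwhelming probability that

$\eta\, Y\left((M_{n-1} - z)^*(M_{n-1} - z) + \eta^2\right)^{-1} Y^* = \frac{\eta}{n} \operatorname{trace}\left((M_{n-1} - z)^*(M_{n-1} - z) + \eta^2\right)^{-1} + O\left(n^{-1+o(1)} \eta\, \|((M_{n-1} - z)^*(M_{n-1} - z) + \eta^2)^{-1}\|_F\right)$.

By mimicking the proof of (139), one has

$\|((M_{n-1} - z)^*(M_{n-1} - z) + \eta^2)^{-1}\|_F \ll n^{o(1)} (n\eta)^{1/2} / \eta^2$
with overwhelming probability. Similarly, by mimicking the proof of (140) one has

$\frac{\eta}{n} \operatorname{trace}\left((M_{n-1} - z)^*(M_{n-1} - z) + \eta^2\right)^{-1} = \operatorname{Im} s(\sqrt{-1}\eta) + O\left(\frac{1}{n\eta}\right)$.

Putting these bounds together, we conclude that

$\eta\, e_n^*(AA^* + \eta^2)^{-1} e_n = \frac{1}{\eta + \operatorname{Im} s(\sqrt{-1}\eta) + o(1)}$

with overwhelming probability; inserting this back into (141) and (136) we conclude
that

(142) $\operatorname{Im} R(\sqrt{-1}\eta)_{2n,2n} = \frac{1}{\eta + \operatorname{Im} s(\sqrt{-1}\eta) + (1 + o(1)) \frac{|z|^2/n}{\eta + \operatorname{Im} s(\sqrt{-1}\eta) + o(1)} + o(1)}$
with overwhelming probability.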

Suppose now that $|z|^2/n \ge 1/2$. Then we have

$y + \frac{|z|^2/n}{y} \gg 1$

for any $y > 0$; this implies that the denominator in (142) has magnitude $\gg 1$, which
gives (135). Thus we may assume that $|z|^2/n < 1/2$.

The bound (142) holds similarly with the index $2n$ replaced by any other index.
Averaging over these indices, we obtain the self-consistent equation

(143) $\operatorname{Im} s(\sqrt{-1}\eta) = \frac{1}{2n} \sum_{j=1}^{2n} \frac{1}{\eta + \operatorname{Im} s(\sqrt{-1}\eta) + (1 + o(1)) \frac{|z|^2/n}{\eta + \operatorname{Im} s(\sqrt{-1}\eta) + o(1)} + o(1)}$
with overwhelming probability. If we write $x := \eta + \operatorname{Im} s(\sqrt{-1}\eta)$, we thus have

$x = \frac{1}{2n} \sum_{j=1}^{2n} \frac{1}{x + (1 + o(1)) \frac{|z|^2/n}{x + o(1)} + o(1)} + \eta$

with overwhelming probability. Note that either $x = o(1)$ or $x + o(1) = (1 + o(1))x$.
In the latter case, we can simplify the above equation as

$x = \frac{1}{2n} \sum_{j=1}^{2n} \frac{1 + o(1)}{x + \frac{|z|^2/n}{x}} + \eta$

and thus

$x \ge (1 + o(1)) \frac{x}{x^2 + |z|^2/n}$.

In particular, this forces $x^2 + |z|^2/n \ge 1 + o(1)$. Since we have assumed that $|z|^2/n < 1/2$,
we conclude that $x \ge 1/2$ (say). We conclude that for each $n^{-1+c} \le \eta \le n^{10}$,
we have

$\operatorname{Im} s(\sqrt{-1}\eta) + \eta = o(1)$

or

$\operatorname{Im} s(\sqrt{-1}\eta) + \eta \ge 1/2$
with overwhelming probability. Rounding $\eta$ to the nearest multiple of (say) $n^{-100}$
and using the union bound (and crude perturbation theory estimates), we conclude
with overwhelming probability that this dichotomy in fact holds for all $n^{-1+c} \le \eta \le n^{10}$.
On the other hand, for $\eta = n^{10}$, one is clearly in the second case of the
dichotomy rather than the first. By continuity, we conclude that the second case
of this dichotomy in fact holds for all $n^{-1+c} \le \eta \le n^{10}$; in particular, we have with
overwhelming probability that

$\operatorname{Im} s(\sqrt{-1}\eta) \gg 1$

when $n^{-1+c} \le \eta \le n^{-c}$. Inserting this bound into (142), we conclude with overwhelming
probability that

$\operatorname{Im} R(\sqrt{-1}\eta)_{2n,2n} \ll 1$

when $n^{-1+c} \le \eta \le n^{-c}$, which gives Proposition 31 in this case. Finally, the case
$\eta \ge n^{-c}$ can be handled by (138).
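
The dichotomy in this argument comes from the shape of the idealised self-consistent equation. As a minimal sketch, assuming all $o(1)$ errors and the additive $\eta$ term are dropped, the equation reads $x = x/(x^2 + t)$ with $t$ standing for $|z|^2/n$; its solutions are $x = 0$ and $x = \sqrt{1 - t}$, and the second stays above $1/2$ when $t < 1/2$. The function names are hypothetical.

```python
import math

def nonzero_root(t: float) -> float:
    """Positive solution of x = x / (x^2 + t), i.e. of x^2 + t = 1, for 0 <= t < 1."""
    return math.sqrt(1.0 - t)

def residual(x: float, t: float) -> float:
    """x - x / (x^2 + t); vanishes exactly at the solutions of the idealised equation."""
    return x - x / (x * x + t)
```

The gap between the two solutions is what allows the continuity argument in $\eta$ to rule out the branch $x = o(1)$.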

Remark 65. A refinement of the above analysis can be used to give more precise
control on the Stieltjes transform of $W_{n,z}$, as well as the counting function $N_I$. See
[3] for more details.



Appendix B. Asymptotics for the real Gaussian ensemble 



The purpose of this appendix is to establish Lemma 11. Our arguments here will
rely heavily on those in [7].

By reflection we may restrict attention to the case when $z_1, \dots, z_l$ lie in the upper
half-plane $\mathbb{C}_+$. Our starting point is the explicit formula

$\rho_n^{(k,l)}(x_1, \dots, x_k, z_1, \dots, z_l) = \operatorname{Pf} \begin{pmatrix} K_n(x_i, x_{i'}) & K_n(x_i, z_{j'}) \\ K_n(z_j, x_{i'}) & K_n(z_j, z_{j'}) \end{pmatrix}_{1 \le i,i' \le k;\ 1 \le j,j' \le l}$

for the correlation functions, where $K_n : (\mathbb{R} \cup \mathbb{C}_+) \times (\mathbb{R} \cup \mathbb{C}_+) \to M_2(\mathbb{C})$ is a certain
explicit $2 \times 2$ matrix kernel obeying the anti-symmetry law

(144) $K_n(\zeta, \zeta') = -K_n(\zeta', \zeta)^T$,

making the expression inside the Pfaffian Pf an anti-symmetric $2(k + l) \times 2(k + l)$
matrix; see [7, Theorem 8]. In view of this formula, we see that Lemma 11 will
follow if we can establish the uniform bound

$K_n(\zeta, \zeta') = O(1)$

for all $\zeta, \zeta' \in \mathbb{R} \cup \mathbb{C}_+$.

To do this, we will need the explicit description of the kernel $K_n$. Following [7], we will need the partial cosine and exponential functions

$$c_{n/2}(\gamma) := \sum_{m=0}^{n/2-1} \frac{\gamma^{2m}}{(2m)!}$$

$$e_{n/2}(\gamma) := \sum_{m=0}^{n-2} \frac{\gamma^m}{m!}$$

as well as the function
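$$r_{n/2}(z,x) := \frac{e^{-z^2/2}}{\sqrt{2\pi}} \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im} z)}\, \frac{2^{(n-3)/2}}{(n-2)!}\, \operatorname{sgn}(x)\, z^{n-1}\, \gamma\Big(\frac{n-1}{2}, \frac{x^2}{2}\Big)$$

where $\operatorname{erfc} := 1 - \operatorname{erf}$ is the complementary error function and

$$\gamma(t,x) = \int_0^x y^{t-1} e^{-y}\, dy$$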

is the incomplete gamma function. In [7, Theorem 8], the formula 

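$$K_n(\gamma, \gamma') := \begin{pmatrix} DS_n(\gamma, \gamma') & S_n(\gamma, \gamma') \\ -S_n(\gamma', \gamma) & IS_n(\gamma, \gamma') + \epsilon(\gamma, \gamma') \end{pmatrix}$$

is given for the kernel $K_n$, where $\epsilon(\gamma, \gamma')$ is equal to $\frac{1}{2}\operatorname{sgn}(\gamma - \gamma')$ when $\gamma, \gamma'$ are real, and equal to $0$ otherwise, and the scalar quantities $DS_n(\gamma, \gamma')$, $S_n(\gamma, \gamma')$, $IS_n(\gamma, \gamma')$ are defined by the following formulae, depending on whether $\gamma, \gamma'$ are real or complex: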

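(1) (Real-real case) If $x, x' \in \mathbb{R}$, then

$$S_n(x,x') := \frac{e^{-(x-x')^2/2}}{\sqrt{2\pi}}\, e^{-xx'} e_{n/2}(xx') + r_{n/2}(x,x')$$

$$DS_n(x,x') := \frac{e^{-(x-x')^2/2}}{\sqrt{2\pi}}\, (x'-x)\, e^{-xx'} e_{n/2}(xx')$$

$$IS_n(x,x') := \frac{e^{-x^2/2}}{2\sqrt{\pi}} \operatorname{sgn}(x') \int_0^{(x')^2/2} \frac{e^{-t}}{\sqrt{t}}\, c_{n/2}(x\sqrt{2t})\, dt - \frac{e^{-(x')^2/2}}{2\sqrt{\pi}} \operatorname{sgn}(x) \int_0^{x^2/2} \frac{e^{-t}}{\sqrt{t}}\, c_{n/2}(x'\sqrt{2t})\, dt$$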
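(2) (Complex-complex case) If $z, z' \in \mathbb{C}_+$, then

$$S_n(z,z') := \frac{i e^{-\frac{1}{2}(z-\bar{z}')^2}}{\sqrt{2\pi}}\, (\bar{z}' - z) \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z)) \operatorname{erfc}(\sqrt{2}\operatorname{Im}(z'))}\, e^{-z\bar{z}'} e_{n/2}(z\bar{z}')$$

$$DS_n(z,z') := \frac{e^{-\frac{1}{2}(z-z')^2}}{\sqrt{2\pi}}\, (z' - z) \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z)) \operatorname{erfc}(\sqrt{2}\operatorname{Im}(z'))}\, e^{-zz'} e_{n/2}(zz')$$

$$IS_n(z,z') := -\frac{e^{-\frac{1}{2}(\bar{z}-\bar{z}')^2}}{\sqrt{2\pi}}\, (\bar{z}' - \bar{z}) \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z)) \operatorname{erfc}(\sqrt{2}\operatorname{Im}(z'))}\, e^{-\bar{z}\bar{z}'} e_{n/2}(\bar{z}\bar{z}').$$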

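(3) (Real-complex case) If $x \in \mathbb{R}$ and $z \in \mathbb{C}_+$, then

$$S_n(x,z) := \frac{i e^{-\frac{1}{2}(x-\bar{z})^2}}{\sqrt{2\pi}} \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z))}\, e^{-x\bar{z}} e_{n/2}(x\bar{z})$$

$$S_n(z,x) := \frac{e^{-\frac{1}{2}(x-z)^2}}{\sqrt{2\pi}} \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z))}\, e^{-xz} e_{n/2}(xz) + r_{n/2}(z,x)$$

$$DS_n(x,z) := \frac{e^{-\frac{1}{2}(x-z)^2}}{\sqrt{2\pi}}\, (z - x) \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z))}\, e^{-xz} e_{n/2}(xz)$$

$$IS_n(x,z) := \frac{i e^{-\frac{1}{2}(x-\bar{z})^2}}{\sqrt{2\pi}} \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z))}\, e^{-x\bar{z}} e_{n/2}(x\bar{z}) - i r_{n/2}(\bar{z}, x).$$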

As $\epsilon(\gamma, \gamma')$ is clearly bounded, it thus suffices (in view of (144)) to show that the expressions $S_n(x,x')$, $DS_n(x,x')$, $IS_n(x,x')$, $S_n(z,z')$, $DS_n(z,z')$, $IS_n(z,z')$, $S_n(x,z)$, $S_n(z,x)$, $DS_n(x,z)$, $IS_n(x,z)$ are all $O(1)$ for $x, x' \in \mathbb{R}$ and $z, z' \in \mathbb{C}_+$. This will be a variant of the estimates in [7, Section 9], which were concerned with the asymptotic values of these expressions as $n \to \infty$ rather than uniform bounds.

We first dispose of the $r_{n/2}$ terms. In the proof of [7, Corollary 9], the estimate

$$|r_{n/2}(z,x)| \le e^{-\operatorname{Re}(z^2)/2} \sqrt{\operatorname{erfc}(\sqrt{2}\operatorname{Im}(z))}\, \frac{|z|^{n-1}}{2^{n/2}(n/2-1)!}$$

is established for any $x \in \mathbb{R}$ and $z \in \mathbb{C}_+$. Using the standard bound

$$\operatorname{erfc}(x) = O\Big(\frac{e^{-x^2}}{1+x}\Big) \tag{145}$$

for any $x \ge 0$, we thus have

$$|r_{n/2}(z,x)| \ll e^{-|z|^2/2}\, \frac{|z|^{n-1}}{2^{(n-1)/2}(n/2-1)!}.$$

But $\frac{|z|^{n-1}}{2^{(n-1)/2}(n/2-1)!}$ is one of the Taylor coefficients of $e^{|z|^2/2}$, and so

$$r_{n/2}(z,x) = O(1). \tag{146}$$

Thus we may ignore all terms involving $r_{n/2}$.
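As a numerical sanity check (not part of the argument), the standard bound (145) can be probed in a few lines of Python. This is only a sketch: the helper name `erfc_ratio` and the sampling grid are illustrative choices that do not appear in [7] or in the text above.

```python
import math

def erfc_ratio(x: float) -> float:
    """Return erfc(x) * (1 + x) * exp(x^2); bound (145) asserts this is O(1) for x >= 0."""
    return math.erfc(x) * (1.0 + x) * math.exp(x * x)

# Sample x on [0, 20]; exp(x^2) stays within double-precision range there.
ratios = [erfc_ratio(0.05 * k) for k in range(401)]
print(max(ratios), min(ratios))
```

On this grid the ratio stays between two positive constants, which is consistent with (145) being sharp up to constants.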

Now we handle the real-real case. Recall from the triangle inequality and Taylor expansion that

$$|e_{n/2}(z)| \le e_{n/2}(|z|) \le \exp(|z|) \tag{147}$$

for any complex number $z$. Thus, for instance, we have

$$|S_n(x,x')| \ll \exp(-(x-x')^2/2 - xx' + |xx'|) + 1 \ll 1$$

since the expression inside the exponential is either $-(x-x')^2/2$ or $-(x+x')^2/2$.
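The estimate (147) and the resulting bound on the main term of $S_n(x,x')$ can be illustrated numerically. The following is a minimal sketch under stated assumptions: the function names `partial_exp` and `s_main_term`, the sample points, and the values of $n$ are hypothetical, and the $r_{n/2}$ term is deliberately omitted.

```python
import cmath
import math

def partial_exp(z: complex, n: int) -> complex:
    """Partial exponential e_{n/2}(z) = sum_{m=0}^{n-2} z^m / m!."""
    term = complex(1.0)   # current term z^m / m!, starting at m = 0
    total = complex(0.0)
    for m in range(n - 1):            # m = 0, 1, ..., n-2
        total += term
        term *= z / (m + 1)
    return total

def s_main_term(x: float, xp: float, n: int) -> float:
    """Main term of S_n(x, x') in the real-real case (r_{n/2} omitted)."""
    gauss = math.exp(-(x - xp) ** 2 / 2 - x * xp) / math.sqrt(2 * math.pi)
    return gauss * partial_exp(complex(x * xp), n).real

# Bound (147): |e_{n/2}(z)| <= exp(|z|) at a few sample points.
samples = [cmath.rect(r, t) for r in (0.5, 3.0, 12.0) for t in (0.0, 1.0, 2.5, math.pi)]
worst_147 = max(abs(partial_exp(z, n)) / math.exp(abs(z))
                for z in samples for n in (2, 10, 60))

# Main term of S_n(x, x') on a grid of real points in [-6, 6].
grid = [0.5 * k for k in range(-12, 13)]
worst_s = max(abs(s_main_term(x, xp, n))
              for x in grid for xp in grid for n in (2, 10, 60))
print(worst_147, worst_s)
```

Both printed quantities should remain bounded independently of $n$, in line with the displayed estimate.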

If one applies the same method to bound $DS_n(x,x')$, one obtains

$$|DS_n(x,x')| \ll |x - x'| \exp(-(x-x')^2/2 - xx' + |xx'|).$$

This bound is $O(1)$ when $xx'$ is positive, but can grow linearly when $xx'$ is negative. To deal with this issue, we need an alternate bound to (147) that saves an additional polynomial factor in some cases:

Lemma 66 (Alternate bound). For any complex number $z$, one has

$$|e_{n/2}(z)| \ll \frac{|z|^{1/2}}{||z| - z|} \exp(|z|)$$

with the convention that the right-hand side is infinite when $z$ is a non-negative real.



Proof. The claim is trivial for $|z| \le 1$, so we may assume that $|z| > 1$. Observe that

$$(|z| - z)\, e_{n/2}(z) = \sum_{m=0}^{n-2} \frac{z^m}{m!} (|z| - m) - \frac{z^{n-1}}{(n-2)!}. \tag{148}$$

An application of Stirling's formula reveals that

$$\frac{z^m}{m!} = O(|z|^{-1/2} \exp(|z|))$$

for all $m$, so the second term on the right-hand side of (148) is $O(|z|^{1/2} \exp(|z|))$. It thus suffices to show that

$$\sum_{m=0}^{n-2} \frac{z^m}{m!} (|z| - m) = O(|z|^{1/2} \exp(|z|)).$$

By the triangle inequality, the left-hand side can be bounded by

$$\sum_{m \le |z|} \frac{|z|^m}{m!} (|z| - m) + \sum_{m > |z|} \frac{|z|^m}{m!} (m - |z|).$$

This expression telescopes to

$$2\, \frac{|z|^{m_0+1}}{m_0!}$$

where $m_0 := \lfloor |z| \rfloor$. By Stirling's formula, this expression is $O(|z|^{1/2} \exp(|z|))$ as required. □
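The two ingredients of this proof, the telescoping of sums of the form $\sum_{m \le N} r^m (r - m)/m!$ and the resulting bound of Lemma 66, can be checked numerically. The sketch below is illustrative only: the names `lemma66_ratio`, `telescoping_sum` and `telescoped_value`, the generic upper limit `upper`, and the sample points are assumptions made for the example.

```python
import cmath
import math

def partial_exp(z: complex, n: int) -> complex:
    """Partial exponential e_{n/2}(z) = sum_{m=0}^{n-2} z^m / m!."""
    term, total = complex(1.0), complex(0.0)
    for m in range(n - 1):
        total += term
        term *= z / (m + 1)
    return total

def lemma66_ratio(z: complex, n: int) -> float:
    """|e_{n/2}(z)| * ||z| - z| / (|z|^{1/2} exp(|z|)); Lemma 66 asserts this is O(1)."""
    r = abs(z)
    return abs(partial_exp(z, n)) * abs(r - z) / (math.sqrt(r) * math.exp(r))

def telescoping_sum(r: float, upper: int) -> float:
    """Direct evaluation of sum_{m=0}^{upper} r^m (r - m) / m!."""
    term, total = 1.0, 0.0
    for m in range(upper + 1):
        total += term * (r - m)
        term *= r / (m + 1)
    return total

def telescoped_value(r: float, upper: int) -> float:
    """Closed form r^{upper+1} / upper! obtained by telescoping the sum above."""
    return r ** (upper + 1) / math.factorial(upper)

points = [cmath.rect(r, t) for r in (1.0, 5.0, 20.0, 50.0)
          for t in (0.3, 1.0, 2.0, math.pi)]
worst_ratio = max(lemma66_ratio(z, n) for z in points for n in (4, 20, 200))
print(worst_ratio, telescoping_sum(7.3, 12), telescoped_value(7.3, 12))
```

The ratio stays bounded over the sampled radii and truncation levels, and the direct sum agrees with its telescoped closed form.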

Inserting this bound in the case when $xx'$ is negative, we conclude that

$$|DS_n(x,x')| \ll \frac{|x - x'|}{|xx'|^{1/2}} \exp(-(x-x')^2/2 - xx' + |xx'|) = \frac{|x| + |x'|}{|x|^{1/2} |x'|^{1/2}} \exp(-(|x| - |x'|)^2/2)$$

and one easily verifies that this expression is $O(1)$.
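Finally, to control $IS_n(x,x')$, it suffices by symmetry to show that

$$\int_0^{(x')^2/2} \frac{e^{-t}}{\sqrt{t}}\, c_{n/2}(x\sqrt{2t})\, dt = O(\exp(x^2/2)). \tag{149}$$

But by Taylor expansion we may bound $c_{n/2}(x\sqrt{2t})$ by $\cosh(x\sqrt{2t})$. Since

$$\int_0^{(x')^2/2} \frac{e^{-t}}{\sqrt{t}} \cosh(x\sqrt{2t})\, dt = \frac{\sqrt{\pi}}{2}\, e^{x^2/2} \Big( \operatorname{erf}\Big(\frac{|x| + |x'|}{\sqrt{2}}\Big) - \operatorname{erf}\Big(\frac{|x| - |x'|}{\sqrt{2}}\Big) \Big),$$

we see from (145) that the left-hand side of (149) is

$$\ll \exp(x^2/2) \exp(-\max(|x| - |x'|, 0)^2/2) \le \exp(x^2/2)$$

as required.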
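The closed form for the $\cosh$ integral used above can be compared against direct quadrature. The following sketch assumes the substitution $t = s^2/2$, which removes the $t^{-1/2}$ singularity, and a prefactor $\sqrt{\pi}/2$ obtained by completing the square; the function names `lhs_integral` and `rhs_closed_form` and the test points are hypothetical.

```python
import math

def lhs_integral(x: float, xp: float, steps: int = 2000) -> float:
    """Integral of e^{-t} t^{-1/2} cosh(x sqrt(2t)) over [0, x'^2/2].

    After substituting t = s^2/2 this equals sqrt(2) * int_0^{|x'|} e^{-s^2/2} cosh(x s) ds,
    which is evaluated with the composite Simpson rule (steps must be even).
    """
    a = abs(xp)
    if a == 0.0:
        return 0.0
    h = a / steps
    f = lambda s: math.exp(-s * s / 2) * math.cosh(x * s)
    total = f(0.0) + f(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return math.sqrt(2) * total * h / 3

def rhs_closed_form(x: float, xp: float) -> float:
    """(sqrt(pi)/2) e^{x^2/2} (erf((|x|+|x'|)/sqrt 2) - erf((|x|-|x'|)/sqrt 2))."""
    a = (abs(x) + abs(xp)) / math.sqrt(2)
    b = (abs(x) - abs(xp)) / math.sqrt(2)
    return math.sqrt(math.pi) / 2 * math.exp(x * x / 2) * (math.erf(a) - math.erf(b))

pairs = [(0.5, 1.0), (2.0, 0.7), (-1.5, 3.0), (3.0, -3.0)]
for x, xp in pairs:
    print(x, xp, lhs_integral(x, xp), rhs_closed_form(x, xp))
```

Since the difference of the two error functions is at most 2, the closed form immediately gives the $O(\exp(x^2/2))$ bound claimed in (149).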

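Next we turn to the complex-complex case. From (145) and (147) we see that

$$|S_n(z,z')| \ll \exp\Big(-\frac{1}{2}\operatorname{Re}((z - \bar{z}')^2)\Big)\, |\bar{z}' - z|\, \frac{\exp(-\operatorname{Im}(z)^2 - \operatorname{Im}(z')^2)}{(1 + \operatorname{Im}(z))^{1/2} (1 + \operatorname{Im}(z'))^{1/2}} \exp(|zz'| - \operatorname{Re}(z\bar{z}')).$$

After some rearrangement, the right-hand side here becomes

$$\frac{|\bar{z}' - z|}{(1 + \operatorname{Im}(z))^{1/2} (1 + \operatorname{Im}(z'))^{1/2}} \exp\Big(-\frac{1}{2}(|z| - |z'|)^2\Big).$$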
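The rearrangement rests on the identity $-\frac{1}{2}\operatorname{Re}((z-\bar{z}')^2) - \operatorname{Im}(z)^2 - \operatorname{Im}(z')^2 + |zz'| - \operatorname{Re}(z\bar{z}') = -\frac{1}{2}(|z| - |z'|)^2$ for the exponents. A quick numerical confirmation is sketched below; the function names `exponent_before` and `exponent_after` and the random sampling are illustrative assumptions.

```python
import random

def exponent_before(z: complex, zp: complex) -> float:
    """Sum of the exponents appearing in the first bound on |S_n(z, z')|."""
    w = (z - zp.conjugate()) ** 2
    return (-0.5 * w.real - z.imag ** 2 - zp.imag ** 2
            + abs(z * zp) - (z * zp.conjugate()).real)

def exponent_after(z: complex, zp: complex) -> float:
    """Exponent after rearrangement: -(|z| - |z'|)^2 / 2."""
    return -0.5 * (abs(z) - abs(zp)) ** 2

random.seed(0)
# Random points in the upper half-plane with |Re| <= 5 and 0 <= Im <= 5.
sample_pairs = [(complex(random.uniform(-5, 5), random.uniform(0, 5)),
                 complex(random.uniform(-5, 5), random.uniform(0, 5)))
                for _ in range(200)]
max_gap = max(abs(exponent_before(z, zp) - exponent_after(z, zp))
              for z, zp in sample_pairs)
print(max_gap)
```

The printed gap should be at the level of floating-point rounding.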

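If one uses Lemma 66 instead of (147), one gains an additional factor of $\frac{|z|^{1/2} |z'|^{1/2}}{||z||z'| - z\bar{z}'|}$. Thus, it suffices to show that

$$\frac{|\bar{z}' - z|}{(1 + \operatorname{Im}(z))^{1/2} (1 + \operatorname{Im}(z'))^{1/2}} \min\Big(1, \frac{|z|^{1/2} |z'|^{1/2}}{||z||z'| - z\bar{z}'|}\Big) \exp\Big(-\frac{1}{2}(|z| - |z'|)^2\Big) \ll 1. \tag{150}$$

By symmetry, we may assume that $0 < \operatorname{Im}(z) \le \operatorname{Im}(z')$. We may assume that $|z|$ and $|z'|$ are comparable and larger than 1, since otherwise the claim easily follows from the $\exp(-\frac{1}{2}(|z| - |z'|)^2)$ term.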

Let $\theta$ denote the angle subtended by $z$ and $z'$. Observe from the triangle inequality that

$$|\bar{z}' - z| \ll ||z| - |z'|| + \operatorname{Im}(z) + |z|\theta \tag{151}$$

and

$$||z||z'| - z\bar{z}'| \gg |z|^2 \theta.$$

The first two terms on the right-hand side of (151) give an acceptable contribution to (150) (bounding the minimum crudely by 1), so it suffices to show that

$$\frac{|z|\theta}{(1 + \operatorname{Im}(z))^{1/2} (1 + \operatorname{Im}(z'))^{1/2}} \min\Big(1, \frac{|z|^{1/2} |z'|^{1/2}}{|z|^2 \theta}\Big) \ll 1,$$

but this is clear after discarding the denominator and using the second term in the minimum. This establishes the bound $|S_n(z,z')| \ll 1$. Similar arguments, which we leave to the reader, show that $|DS_n(z,z')| \ll 1$ and $|IS_n(z,z')| \ll 1$.
Finally, we turn to the real-complex case. Using (147) and (145), we can bound

$$|S_n(x,z)| \ll \exp\Big(-\frac{1}{2}\operatorname{Re}((x - \bar{z})^2)\Big)\, \frac{\exp(-\operatorname{Im}(z)^2)}{1 + \operatorname{Im}(z)^{1/2}} \exp(-\operatorname{Re}(x\bar{z}) + |x||z|).$$

The right-hand side simplifies to $\exp(-(|x| - |z|)^2/2)/(1 + \operatorname{Im}(z)^{1/2})$, which is clearly $O(1)$.

A similar argument (using (146)) shows that $S_n(z,x) = O(1)$ and $IS_n(x,z) = O(1)$. The bound $DS_n(x,z) = O(1)$ can be established by the same arguments used to handle the complex-complex case; we leave the details to the reader. This completes the proof of Lemma 11.